NOVEL NUCLEIC ACIDS AND POLYPEPTIDES



1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods.

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other types
of data and products dependent on DNA and amino acid sequences.



3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.



WO 01/53312
PCT/US00/34263

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-1786 and 3573-5358. The polypeptide sequences are
designated SEQ ID NO: 2n (wherein n = 1 to 20). The nucleic acids and polypeptides are provided
in the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenine; C is
cytosine; G is guanine; T is thymine; and N is any of the four bases. In the amino acids provided in
the Sequence Listing, * corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-1786 and 3573-5358 under stringent hybridization
conditions; nucleic acid sequences which are allelic variants or species homologues of any of the
nucleic acid sequences recited above; or nucleic acid sequences that encode a peptide comprising a
specific domain or truncation of the peptides encoded by SEQ ID NO: 1-1786 and 3573-5358. Also
included is a polynucleotide comprising a nucleotide sequence having at least 90% identity to an
identifying sequence of SEQ ID NO: 1-1786 and 3573-5358, or a degenerate variant or fragment
thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The sequence information
can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that uniquely identifies or
represents the sequence information of SEQ ID NO: 1-1786 and 3573-5358.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358
or novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the nucleic
acid sequences of SEQ ID NO: 1-1786 and 3573-5358 or novel segments or parts of the nucleic
acids provided herein are used in diagnostics for identifying expressed genes or, as well known in
the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-1786 and
3573-5358; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID
NO: 1-1786 and 3573-5358; and a polynucleotide comprising any of the nucleotide sequences of the
mature protein coding sequences of SEQ ID NO: 1-1786 and 3573-5358. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent hybridization conditions to (a) the complement of any one of the nucleotide sequences set
forth in SEQ ID NO: 1-1786 and 3573-5358; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g., orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding
full length or mature protein. Polypeptides of the invention also include polypeptides with biological
activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence set forth in
SEQ ID NO: 1-1786 and 3573-5358; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
amino acid sequence identity) that preferably retain biological activity are also contemplated. The
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably
produced by recombinant means using the genetically engineered cells (e.g., host cells) of the
invention.
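
The identity thresholds listed for "substantial equivalents" can be illustrated with a minimal Python sketch. The source does not state here how identity is computed, so the sketch assumes two sequences that are already aligned and of equal length, with no gaps; the function name `percent_identity` and the two peptides are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions carrying the same residue.

    Assumes the two sequences are pre-aligned, gap-free and of equal
    length (an assumption made for this sketch only).
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    identical = sum(x == y for x, y in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * identical / len(seq_a)


# Made-up ten-residue peptides that differ at the last position only.
reference = "MKTAYIAKQR"
variant = "MKTAYIAKQK"

identity = percent_identity(reference, variant)
print(identity)          # 90.0
print(identity >= 85.0)  # True: meets an 85% threshold
print(identity >= 95.0)  # False: misses a 95% threshold
```

In practice, identity between sequences of different lengths is computed over an alignment produced by a dedicated alignment tool; the sketch only shows the final counting step.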



The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a 
hydrophilic, e.g., pharmaceutically acceptable, carrier. 

The invention also provides host cells transformed or transfected with a polynucleotide of
the invention.

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical
mapping of the human genome.

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition
which comprises the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the 
identification of subjects exhibiting a predisposition to such conditions. The invention provides 
a method for detecting the polynucleotides of the invention in a sample, comprising contacting 
the sample with a compound that binds to and forms a complex with the polynucleotide of 
interest for a period sufficient to form the complex and under conditions sufficient to form a 
complex and detecting the complex such that if a complex is detected, the polynucleotide of 
interest is detected. The invention also provides a method for detecting the polypeptides of the 
invention in a sample comprising contacting the sample with a compound that binds to and forms 
a complex with the polypeptide under conditions and for a period sufficient to form the complex 
and detecting the formation of the complex such that if a complex is formed, the polypeptide is 
detected. 

The invention also provides kits comprising polynucleotide probes and/or monoclonal 
antibodies, and optionally quantitative standards, for carrying out methods of the invention. 
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and 
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as 
recited above. 

The invention also provides methods for the identification of compounds that modulate 
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides 
of the invention. Such methods can be utilized, for example, for the identification of compounds 
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are 
not limited to, assays for identifying compounds and other substances that interact with (e.g., 
bind to) the polypeptides of the invention. The invention provides a method for identifying a 
compound that binds to the polypeptides of the invention comprising contacting the compound 
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound 
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and 
detecting the complex by detecting the reporter gene sequence expression such that if expression 
of the reporter gene is detected, the compound that binds to a polypeptide of the invention is
identified. 

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting 
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or 
disorders as recited herein comprising administering compounds and other substances that 
modulate the overall activity of the target gene products. Compounds and other substances can 
effect such modulation either on the level of target gene/protein expression or target protein
activity. 

The polypeptides of the present invention and the polynucleotides encoding them are also 
useful for the same functions known to one of skill in the art as the polypeptides and 
polynucleotides to which they have homology (set forth in Table 2); for which they have a 
signature region (as set forth in Table 3); or for which they have homology to a gene family (as 
set forth in Table 4). If no homology is set forth for a sequence, then the polypeptides and 
polynucleotides of the present invention are useful for a variety of applications, as described
herein, including use in arrays for detection. 

4. DETAILED DESCRIPTION OF THE INVENTION 



4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a",
"an" and "the" include plural references unless the context clearly dictates otherwise.

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that
total complementarity exists between the single stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
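
The distinction between partial and complete complementarity can be illustrated with a short Python sketch. The function names `complement` and `complementarity` are hypothetical, and the sketch assumes two equal-length DNA strands compared position by position; it counts base pairs only and does not model hybridization efficiency or strength.

```python
# Watson-Crick pairing rules for the four DNA bases.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand_5to3: str) -> str:
    """Return the partner strand written 3'->5', position by position."""
    return "".join(PAIRS[base] for base in strand_5to3.upper())


def complementarity(strand_5to3: str, partner_3to5: str) -> float:
    """Fraction of aligned positions that base-pair (1.0 means complete)."""
    if not strand_5to3 or len(strand_5to3) != len(partner_3to5):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        PAIRS[a] == b
        for a, b in zip(strand_5to3.upper(), partner_3to5.upper())
    )
    return paired / len(strand_5to3)


print(complement("AGT"))              # TCA, the example given in the text
print(complementarity("AGT", "TCA"))  # 1.0: complete complementarity
print(complementarity("AGT", "TCG"))  # partial: two of three positions pair
```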

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages, particularly
from the yolk sac, mesenteries, or gonadal ridges, during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which 
modulates the expression of an operably linked ORF or another EMF. 

As used herein, a sequence is said to "modulate the expression of an operably linked 
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs 
include, but are not limited to, promoters, and promoter modulating sequences (inducible 
elements). One class of EMFs are nucleic acid fragments which induce the expression of an 
operably linked ORF in response to a specific regulatory factor or physiological event. 

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic 
origin which may be single-stranded or double-stranded and may represent the sense or the 
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the 
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T 
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences 
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this 
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or 
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic 
acid which is capable of being expressed in a recombinant transcriptional unit comprising 
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene. 

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or 
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide 
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides, 
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500 
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100 
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30 
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides, 
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30 
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl. 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the 
art. Probes of the present invention, their preparation and/or labeling are elaborated in 
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor 
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John 
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their 
entirety. 

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The
sequence information can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that
uniquely identifies or represents the sequence information of that sequence of SEQ ID NO: 1-1786
and 3573-5358. One such segment can be a twenty-mer nucleic acid sequence because the
probability that a twenty-mer is fully matched in the human genome is 1 in 300. In the human
genome, there are three billion base pairs in one set of chromosomes. Because $4^{20}$ possible
twenty-mers exist, there are 300 times more twenty-mers than there are base pairs in a set of
human chromosomes. Using the same analysis, the probability for a seventeen-mer to be fully
matched in the human genome is approximately 1 in 5. When these segments are used in arrays
for expression studies, fifteen-mer segments can be used. The probability that the fifteen-mer is
fully matched in the expressed sequences is also approximately one in five because expressed
sequences comprise less than approximately 5% of the entire genome sequence.
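
The arithmetic behind these full-match estimates can be checked with a short Python sketch. It assumes, as a simplification not stated in the source, that the target behaves like a random sequence of equiprobable bases, so that the chance of a given k-mer matching at one position is 1 in 4^k. The function name `expected_full_matches` and the constant names are illustrative.

```python
GENOME_BP = 3_000_000_000            # base pairs in one set of chromosomes
EXPRESSED_BP = GENOME_BP * 5 // 100  # expressed sequences taken as 5% of the genome


def expected_full_matches(k: int, target_bp: int) -> float:
    """Expected number of chance full matches of one k-mer in a target.

    Each of roughly target_bp start positions matches with probability
    1 / 4**k under the equiprobable-base assumption.
    """
    return target_bp / 4**k


for label, k, target in [
    ("20-mer, whole genome", 20, GENOME_BP),
    ("17-mer, whole genome", 17, GENOME_BP),
    ("15-mer, expressed sequences", 15, EXPRESSED_BP),
]:
    odds = 1 / expected_full_matches(k, target)
    print(f"{label}: about 1 in {odds:.1f}")
```

Under these assumptions the sketch gives roughly 1 in 370, 1 in 6 and 1 in 7 respectively, the same order as the rounded figures quoted in the text.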

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match ($1/4^{25}$) times the
increased probability for mismatch at each nucleotide position ($3 \times 25$). The probability that an
eighteen mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
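
The single-mismatch calculation can be sketched in Python in the same way. The sketch multiplies the full-match probability (1 in 4^k) by the 3 x k ways of placing one wrong base, then scales by the size of the target; it assumes a random target with equiprobable bases, which the source does not state, and the names used are illustrative.

```python
GENOME_BP = 3_000_000_000            # base pairs in one set of chromosomes
EXPRESSED_BP = GENOME_BP * 5 // 100  # expressed sequences taken as 5% of the genome


def expected_single_mismatch_hits(k: int, target_bp: int) -> float:
    """Expected chance occurrences of a k-mer with exactly one mismatch.

    There are k positions and 3 alternative bases per position, so 3 * k
    single-mismatch variants, each matching a given position with
    probability 1 / 4**k.
    """
    return 3 * k * target_bp / 4**k


for label, k, target in [
    ("25-mer, whole genome", 25, GENOME_BP),
    ("20-mer, whole genome", 20, GENOME_BP),
    ("18-mer, expressed sequences", 18, EXPRESSED_BP),
]:
    odds = 1 / expected_single_mismatch_hits(k, target)
    print(f"{label}: about 1 in {odds:.1f}")
```

With these inputs the sketch gives about 1 in 5,000 for the twenty-five mer, about 1 in 6 for the twenty-mer and about 1 in 8 for the eighteen mer, in line with the order of magnitude of the figures in the text.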
The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein. 
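
As an illustration of this definition, the following Python sketch scans the three forward reading frames of a DNA string and returns the longest run of triplets that contains no termination codon. It follows the definition as worded here and therefore does not require a start codon; the function name `longest_orf` and the test strings are made up for the example, and the reverse strand is not examined.

```python
# The three termination codons of the standard genetic code (DNA form).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq: str) -> str:
    """Longest stretch of in-frame triplets free of termination codons."""
    seq = seq.upper()
    best = ""
    for frame in range(3):
        current = []  # codons collected since the last stop in this frame
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in STOP_CODONS:
                current = []
            else:
                current.append(codon)
                if 3 * len(current) > len(best):
                    best = "".join(current)
    return best


print(longest_orf("ATGAAA"))       # ATGAAA
print(longest_orf("ATGGCCTAAGG"))  # TGGCCTAAG (frame 2 has no stop codon)
print(repr(longest_orf("TAA")))    # '' (the only triplet is a stop)
```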

The terms "operably linked" or "operably associated" refer to functionally related nucleic 
acid sequences. For example, a promoter is operably associated or operably linked with a coding 
sequence if the promoter controls the transcription of the coding sequence. While operably 
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic 
elements e.g. repressor genes are not contiguously linked to the coding sequence but still control 
transcription/translation of the coding sequence. 

The term "pluripotent" refers to the capability of a cell to differentiate into a number of 
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its 
differentiation capability in comparison to a totipotent cell. 

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 200 amino acids, more preferably less
than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is
from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient
length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that 
have not been genetically engineered and specifically contemplates various polypeptides arising 
from post-translational modifications of the polypeptide including, but not limited to, acetylation, 
carboxylation, glycosylation, phosphorylation, lipidation and acylation. 

The term "translated protein coding portion" means a sequence which encodes for the full 
length protein which may include any leader sequence or any processing sequence. 

The term "mature protein coding sequence" means a sequence which encodes a peptide 
or protein without a signal or leader sequence. The "mature protein portion" means that portion 
of the protein which does not include a signal or leader sequence. The peptide may have been 
produced by processing in the cell which removes any leader/signal sequence. The mature 
protein portion may or may not include the initial methionine residue. The methionine residue 
may be removed from the protein during processing in the cell. The peptide may be produced 
synthetically or the protein may have been produced using a polynucleotide only encoding for 
the mature protein coding sequence. 
The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur
in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon
substitutions, such as the silent changes which produce various restriction sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain
affinities, or degradation/turnover rate.
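
A silent codon change that creates a restriction site can be illustrated with a minimal Python sketch. It builds the standard genetic code table, translates two made-up nine-base sequences, and shows that they encode the same peptide while only one contains the EcoRI recognition sequence GAATTC. The sequences and the function name `translate` are hypothetical and chosen only for this example.

```python
# Standard genetic code, codons enumerated in T, C, A, G order; * marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 0, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


ECORI_SITE = "GAATTC"     # EcoRI recognition sequence
original = "ATGGAGTTT"    # Met-Glu-Phe, no EcoRI site
silent = "ATGGAATTC"      # GAG->GAA and TTT->TTC, both synonymous

print(translate(original), translate(silent))        # MEF MEF
print(ECORI_SITE in original, ECORI_SITE in silent)  # False True
```

Because the two sequences differ only at third codon positions that do not change the encoded residue, the polypeptide is unchanged while the new site becomes available for cloning.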

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for activity.
Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.
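
The amino acid groupings given above for conservative substitutions can be expressed as a small lookup, sketched here in Python with one-letter residue codes. Treating a substitution as conservative exactly when both residues fall in the same group is a simplifying assumption made for this sketch; the names `GROUPS`, `group_of` and `is_conservative` are hypothetical.

```python
# The four classes named in the text, written with one-letter residue codes.
GROUPS = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def group_of(residue: str) -> str:
    """Return the name of the class that contains the residue."""
    for name, members in GROUPS.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(old: str, new: str) -> bool:
    """True when both residues belong to the same class."""
    return group_of(old) == group_of(new)


print(is_conservative("L", "I"))  # True: both nonpolar
print(is_conservative("D", "E"))  # True: both acidic
print(is_conservative("K", "D"))  # False: basic replaced by acidic
print(is_conservative("C", "S"))  # True: both polar neutral
```

A cysteine-to-serine change, for instance, stays within the polar neutral class while removing the thiol needed for a disulfide bridge.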

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than
1000 daltons, can be present).

1 5 The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from 

at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or 
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in 
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a 
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or
polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein, means 
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian) 
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in 
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial"
defines a polypeptide or protein essentially free of native endogenous substances and
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern in general different from those 
expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements 
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural 
or coding sequence which is transcribed into mRNA and translated into protein, and (3)
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant 
protein is expressed without a leader or transport sequence, it may include an amino terminal 
methionine residue. This residue may or may not be subsequently cleaved from the expressed 
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably integrated 
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant 
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will 
express heterologous polypeptides or proteins upon induction of the regulatory elements linked
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in 
gene expression, for example, promoters or enhancers. Recombinant expression systems as 
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the 
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells
can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a membrane, 
including transport as a result of signal sequences in its amino acid sequence when it is expressed 
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly 
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed.
"Secreted" proteins also include without limitation proteins that are transported across the
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for 
14-base oligonucleotides), 48°C (for 17-base oligos), 55°C (for 20-base oligonucleotides), and 
60°C (for 23-base oligonucleotides). 
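
The oligonucleotide wash temperatures listed above can be captured as a simple lookup. The following is a minimal sketch only; the names OLIGO_WASH_TEMP_C and wash_temperature_c are hypothetical and are not part of the disclosure, and no interpolation between the listed lengths is assumed.

```python
# Exemplary wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, exactly as listed in the text.
# The names used here are illustrative assumptions, not part of the disclosure.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo_length):
    """Return the listed wash temperature, or None for an unlisted length."""
    return OLIGO_WASH_TEMP_C.get(oligo_length)
```

Lengths that are not listed return None rather than a guessed value, since the text gives temperatures only for the four stated lengths.
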

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid
sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse
functional dissimilarity between the reference and subject sequences. Typically, such a
substantially equivalent sequence varies from one of those listed herein by no more than about
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided
by the total number of residues in the substantially equivalent sequence is about 0.35 or less).
Such a sequence is said to have 65% sequence identity to the listed sequence. In one
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment,
by no more than 25% (75% sequence identity); in a further variation of this embodiment, by
no more than 20% (80% sequence identity); in a further variation of this embodiment, by no
more than 10% (90% sequence identity); and in a further variation of this embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed
amino acid sequence, more preferably at least 90% sequence identity. Substantially equivalent
nucleotide sequences of the invention can have lower percent sequence identities, taking into
account, for example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, and most
preferably at least about 95% identity. For the purposes of the present invention, sequences
having substantially equivalent biological activity and substantially equivalent expression
characteristics are considered substantially equivalent. For the purposes of determining
equivalence, truncation of the mature sequence (e.g., via a mutation which creates a spurious
stop codon) should be disregarded. Sequence identity may be determined, e.g., using the Jotun
Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between sequences can
also be determined by other methods known in the art, e.g., by varying hybridization conditions.
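
The arithmetic in the definition above (changes divided by the number of residues in the substantially equivalent sequence) can be illustrated with a short sketch. It assumes the two sequences have already been aligned by some other method and are supplied as equal-length strings with "-" marking gaps; the function names are hypothetical, and the sketch covers only the numerical criterion, not the functional one.

```python
def variation_fraction(reference_aligned, subject_aligned, gap="-"):
    """Fraction of variation of the subject relative to the reference.

    Both inputs are assumed to be pre-aligned, equal-length strings in which
    `gap` marks an alignment gap. The denominator is the number of residues
    in the subject sequence, following the definition in the text.
    """
    if len(reference_aligned) != len(subject_aligned):
        raise ValueError("aligned sequences must have equal length")
    changes = 0           # substitutions + additions + deletions
    subject_residues = 0  # residues actually present in the subject
    for ref, sub in zip(reference_aligned, subject_aligned):
        if ref == gap and sub == gap:
            continue  # column carries no information
        if sub != gap:
            subject_residues += 1
        if ref != sub:
            changes += 1
    if subject_residues == 0:
        raise ValueError("subject sequence has no residues")
    return changes / subject_residues


def is_substantially_equivalent(reference_aligned, subject_aligned,
                                max_variation=0.35):
    """Apply only the numerical threshold (about 0.35 by default)."""
    return variation_fraction(reference_aligned, subject_aligned) <= max_variation
```

For example, aligning "ACGTACGTAC" with "ACGTTCGT-C" gives one substitution and one deletion over nine subject residues, a variation of 2/9 (about 22%), which falls within the 35% limit.
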

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction 
of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified
using known UMFs as a target sequence or target motif with the computer-based systems
described below. The presence and activity of a UMF can be confirmed by attaching the
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated
with an appropriate host under appropriate conditions and the uptake of the marker sequence is
determined. As described above, a UMF will increase the frequency of uptake of a linked
marker sequence.

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION

Nucleotide sequences of the invention are set forth in the Sequence Listing.
The isolated polynucleotides of the invention include a polynucleotide comprising the
nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358; a polynucleotide encoding any one
of the peptide sequences of SEQ ID NO:1787-3572 and 5359-7144; and a polynucleotide
comprising the nucleotide sequence encoding the mature protein coding sequence of the
polypeptides of any one of SEQ ID NO:1787-3572 and 5359-7144. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent conditions to (a) the complement of any of the nucleotide sequences of SEQ ID NO:1-1786
and 3573-5358; (b) nucleotide sequences encoding any one of the amino acid sequences
set forth in the Sequence Listing; (c) a polynucleotide which is an allelic variant of any
polynucleotide recited above; (d) a polynucleotide which encodes a species homolog of any of
the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a
specific domain or truncation of the polypeptides of SEQ ID NO:1787-3572 and 5359-7144.
Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic
domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and
substrate binding domains; and domains in ligand polypeptides include receptor-binding
domains.

The polynucleotides of the invention include naturally occurring or wholly or partially
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides 
may include all of the coding region of the cDNA or may represent a portion of the coding 
region of the cDNA. 

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 can be obtained
by screening appropriate cDNA or genomic DNA libraries under suitable hybridization conditions
using any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 or a portion thereof as a
probe. Alternatively, the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 may be used as the
basis for suitable primer(s) that allow identification and/or amplification of genes in appropriate
genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences
(including cDNA and genomic sequences) obtained from one or more public databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length
gene.

The polynucleotides of the invention also provide polynucleotides including nucleotide 
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides 
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, more typically at least about 90%, and even more typically at least
about 95%, sequence identity to a polynucleotide recited above. 

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO:1-1786 and 3573-5358, or complements thereof, which fragment is greater than
about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and
most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more
that are selective for (i.e., specifically hybridize to) any one of the polynucleotides of the
invention are contemplated. Probes capable of specifically hybridizing to a polynucleotide can
differentiate polynucleotide sequences of the invention from other polynucleotide sequences in
the same family of genes or can differentiate human genes from genes of other species, and are
preferably based on unique nucleotide sequences.

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO:1-1786
and 3573-5358, a representative fragment thereof, or a nucleotide sequence at least 90% identical,
preferably 95% identical, to SEQ ID NO:1-1786 and 3573-5358 with a sequence from another
isolate of the same species. Furthermore, to accommodate codon variability, the invention includes
nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed
herein. In other words, in the coding region of an ORF, substitution of one codon for another codon
that encodes the same amino acid is expressly contemplated.
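
The codon substitution described above can be checked computationally. The following is a minimal sketch that assumes the standard genetic code and in-frame DNA ORFs written in upper case; the names CODON_TABLE, translate and encode_same_protein are hypothetical and are not part of the disclosure.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ("*" marks a stop).
# Use of the standard code is an assumption of this sketch.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf):
    """Translate an in-frame, upper-case DNA ORF; trailing partial codons are ignored."""
    usable = len(orf) - len(orf) % 3
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, usable, 3))


def encode_same_protein(orf_a, orf_b):
    """True if two ORFs differ, at most, by synonymous codon substitutions."""
    return translate(orf_a) == translate(orf_b)
```

For instance, ATGGCTTTATAA and ATGGCCCTGTAA differ at two codons (GCT to GCC, TTA to CTG) yet both translate to Met-Ala-Leu followed by a stop codon.
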

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO:1-1786 and 3573-5358, can be obtained by searching a database using an
algorithm or a program. Preferably, BLAST, which stands for Basic Local Alignment Search Tool,
is used to search for local sequence alignments (Altschul, S.F. J. Mol. Evol. 36:290-300 (1993) and
Altschul S.F. et al. J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search
against Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also
provided by the present invention. Species homologs may be isolated and identified by making
suitable probes or primers from the sequences provided herein and screening a suitable nucleic
acid source from the desired species.

The invention also encompasses allelic variants of the disclosed polynucleotides or 
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also 
encode proteins which are identical, homologous or related to that encoded by the
polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more 
domains of the invention and heterologous protein sequences. 

The polynucleotides of the invention additionally include the complement of any of the 
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or 
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known 
to those of skill in the art and can include, for example, methods for determining hybridization 
conditions that can routinely isolate polynucleotides of the desired sequence identities. 

In accordance with the invention, polynucleotide sequences comprising the mature 
protein coding sequences corresponding to any one of SEQ ID NO:1-1786 and 3573-5358, or
functional equivalents thereof, may be used to generate recombinant DNA molecules that direct 
the expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells. 
Also included are the cDNA inserts of any of the clones identified herein. 

A polynucleotide according to the invention can be joined to any of a variety of other 
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al. 
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful 
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g., 
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the 
art. Accordingly, the invention also provides a vector including a polynucleotide of the 
invention and a host cell containing the polynucleotide. In general, the vector contains an origin 
of replication functional in at least one organism, convenient restriction endonuclease sites, and a 
selectable marker for the host cell. Vectors according to the invention include expression 
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell 
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism. 

The present invention further provides recombinant constructs comprising a nucleic acid 
having any of the nucleotide sequences of SEQ ID NO: 1-1786 and 3573-5358 or a fragment 
thereof or any other polynucleotides of the invention. In one embodiment, the recombinant 
constructs of the present invention comprise a vector, such as a plasmid or viral vector, into 
which a nucleic acid having any of the nucleotide sequences of SEQ ID NO:1-1786 and
3573-5358 or a fragment thereof is inserted, in a forward or reverse orientation. In the case of a vector
comprising one of the ORFs of the present invention, the vector may further comprise regulatory 
sequences, including for example, a promoter, operably linked to the ORF. Large numbers of 
suitable vectors and promoters are known to those of skill in the art and are commercially 
available for generating the recombinant constructs of the present invention. The following
vectors are provided by way of example. Bacterial: pBs, phagescript, PsiX174, pBluescript SK,
pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3,
pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, PXTI, pSG (Stratagene);
pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector
or cell in such a way that the protein is expressed by a host cell which has been transformed
(transfected) with the ligated polynucleotide/expression control sequence.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
transferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art.

Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be
employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or
chemical means, and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intramuscular injection of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA.

4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO:1-1786 and 3573-5358, or fragments, analogs or derivatives thereof.
An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense"
nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded
cDNA molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic
acid molecules are provided that comprise a sequence complementary to at least about 10, 25,
50, 100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic
acid molecules encoding fragments, homologs, derivatives and analogs of a protein of any of
SEQ ID NO:1787-3572 and 5359-7144 or antisense nucleic acids complementary to a nucleic
acid sequence of SEQ ID NO:1-1786 and 3573-5358 are additionally provided.




In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region" 
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers 
to the region of the nucleotide sequence comprising codons which are translated into amino acid 
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a 
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term 
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not 
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions). 

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID
NO:1-1786 and 3573-5358), antisense nucleic acids of the invention can be designed according
to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule 
can be complementary to the entire coding region of a mRNA, but more preferably is an 
oligonucleotide that is antisense to only a portion of the coding or noncoding region of a mRNA. 
For example, the antisense oligonucleotide can be complementary to the region surrounding the 
translation start site of a mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 
15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention 
can be constructed using chemical synthesis or enzymatic ligation reactions using procedures 
known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can 
be chemically synthesized using naturally occurring nucleotides or variously modified 
nucleotides designed to increase the biological stability of the molecules or to increase the 
physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., 
phosphorothioate derivatives and acridine substituted nucleotides can be used. 
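
As an illustration of Watson-Crick design, an antisense DNA oligonucleotide is the reverse complement of the targeted stretch of the coding strand. The following is a minimal sketch under stated assumptions: it handles unmodified A, C, G and T bases only, uses 0-based half-open indexing into the coding strand, and the function names are hypothetical rather than part of the disclosure.

```python
# Watson-Crick complements for unmodified DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_dna(sense):
    """Return the reverse complement of a sense-strand DNA sequence (5' to 3')."""
    return sense.translate(_COMPLEMENT)[::-1]


def antisense_oligo(coding_strand, start, end):
    """Antisense oligonucleotide against coding_strand[start:end].

    `start` and `end` are 0-based, half-open indices chosen by the caller,
    e.g. to bracket the translation start site.
    """
    if not 0 <= start < end <= len(coding_strand):
        raise ValueError("target window lies outside the coding strand")
    return antisense_dna(coding_strand[start:end])
```

For example, for the illustrative coding strand GCCACCATGGCTTTA, the window from index 3 to 12 (ACCATGGCT, which spans the ATG start codon) yields the antisense oligonucleotide AGCCATGGT.
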

Examples of modified nucleotides that can be used to generate the antisense nucleic acid
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface,
e.g., by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell
surface receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells
using the vectors described herein. To achieve sufficient intracellular concentrations of antisense
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the
control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a
single-stranded nucleic acid, such as a mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach (1988)
Nature 334: 585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of a mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO:1-1786
and 3573-5358). For example, a derivative of a Tetrahymena L-19 IVS RNA can be
constructed in which the nucleotide sequence of the active site is complementary to the
nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat.
No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be
used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA
molecules. See, e.g., Bartel et al. (1993) Science 261: 1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660: 27-36; and
Maher (1992) Bioassays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base 
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or 
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic 
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid 
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a 
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral 
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under 
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using 
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above; 
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For 
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of 
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication. 
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a 
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in 
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem
Lett 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res.
5:539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a
peptide, a hybridization-triggered cross-linking agent, a transport agent, a hybridization-triggered
cleavage agent, etc.



4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the
invention introduced into the host cell using known transformation, transfection or infection
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association
with a regulatory sequence heterologous to the host cell which drives expression of the
polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit, or 
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous 
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the 
naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding
sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower 
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a 
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by 
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention, can be used in conventional manners to produce the gene 
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a 
heterologous protein under the control of the EMF. 

Any host/vector system can be used to express one or more of the ORFs of the present 
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate 
promoters. Cell-free translation systems can also be employed to produce such proteins using 
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and 
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New 
York (1989), the disclosure of which is hereby incorporated by reference. 

Various mammalian cell culture systems can also be employed to express recombinant 
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney 
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a
compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein
refolding steps can be used, as necessary, in completing configuration of the mature protein.
Finally, high performance liquid chromatography (HPLC) can be employed for final purification
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient 
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing 
agents. 

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods. 

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene
may be replaced by homologous recombination. As described herein, gene targeting can be used
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions,
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or
combinations of said sequences. Alternatively, sequences which affect the structure or stability
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element.
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new
sequences are added. In all cases, the identification of the targeting event may be facilitated by
the use of one or more selectable marker genes that are contiguous with the targeting DNA,
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell
genome. The identification of the targeting event may also be facilitated by the use of one or
more marker genes exhibiting the property of negative selection, such that the negatively
selectable marker is linked to the exogenous DNA, but configured such that the negatively
selectable marker flanks the targeting sequence, and such that a correct homologous
recombination event with sequences in the host cell genome does not result in the stable
integration of the negatively selectable marker. Markers useful for this purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference
herein in its entirety.

4.6 POLYPEPTIDES OF THE INVENTION

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising: the amino acid sequences set forth as any one of SEQ ID NO:1787-3572 and
5359-7144 or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID
NO:1-1786 and 3573-5358 or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides preferably with biological or immunological activity that are
encoded by: (a) a polynucleotide having any one of the nucleotide sequences set forth in SEQ ID
NO:1-1786 and 3573-5358 or (b) polynucleotides encoding any one of the amino acid sequences
set forth as SEQ ID NO:1787-3572 and 5359-7144 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions.
The invention also provides biologically active or immunologically active variants of any of the
amino acid sequences set forth as SEQ ID NO:1787-3572 and 5359-7144 or the corresponding
full length or mature protein; and "substantial equivalents" thereof (e.g., with at least about
65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least
about 90%, typically at least about 95%, more typically at least about 98%, or most typically at
least about 99% amino acid identity) that retain biological activity. Polypeptides encoded by
allelic variants may have a similar, increased, or decreased activity compared to polypeptides
comprising SEQ ID NO:1787-3572 and 5359-7144.
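
The percent amino acid identity thresholds recited above can be illustrated with a short Python sketch. It assumes the two sequences have already been aligned (for example by one of the alignment programs known in the art) and that gaps are written as "-". The function name percent_identity and the choice of denominator (all aligned columns, including single-gap columns) are assumptions for illustration; other conventions for the denominator exist and give slightly different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which both sequences carry a gap are ignored. A column
    counts as a match only when both residues are identical and not a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical peptide pair differing at one of ten positions.
reference = "MKTAYIAKQR"
variant = "MKTAYLAKQR"
identity = percent_identity(reference, variant)
print(f"{identity:.1f}% identity")
```

Under these assumptions the variant shown would meet the "at least about 90%" threshold but not the "at least about 95%" threshold.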

Fragments of the proteins of the present invention which are capable of exhibiting 
biological activity are also encompassed by the present invention. Fragments of the protein may 
be in linear form or they may be cyclized using known methods, for example, as described in H.
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes, 
including increasing the valency of protein binding sites. 

The present invention also provides both full-length and mature forms (for example,
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding
sequence is identified in the sequence listing by translation of the disclosed nucleotide
sequences. The mature form of such protein may be obtained by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form
of the protein is also determinable from the amino acid sequence of the full-length form. Where
proteins of the present invention are membrane bound, soluble forms of the proteins are also
provided. In such forms, part or all of the regions causing the proteins to be membrane bound
are deleted so that the proteins are fully secreted from the cell in which they are expressed.

Protein compositions of the present invention may further comprise an acceptable carrier, 
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic
acid fragments of the present invention are the ORFs that encode proteins.
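
The notion of a degenerate variant can be illustrated with a minimal Python sketch that translates two different nucleotide sequences using the standard genetic code and confirms that they encode the same polypeptide. The two short ORFs and the function name translate are hypothetical examples and are not sequences of the invention.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate a DNA coding sequence codon by codon (stop codons shown as '*')."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, usable, 3))


# Two hypothetical ORF fragments that differ at several nucleotide positions.
variant_a = "ATGGCTAGCAAAGGAGAA"
variant_b = "ATGGCCTCTAAGGGTGAG"

print(translate(variant_a), translate(variant_b))
print("degenerate variants:", variant_a != variant_b and translate(variant_a) == translate(variant_b))
```

The sketch uses only the standard code; organisms or organelles with alternative genetic codes would need a different table.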
A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from
cells which have been altered to express the desired polypeptide or protein. As used herein, a
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic
manipulation, is made to produce a polypeptide or protein which it normally does not produce or
which the cell normally produces at a lower level. One skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides
or proteins of the present invention.

The invention also relates to methods for producing a polypeptide comprising growing a
culture of host cells of the invention in a suitable culture medium, and purifying the protein from
the cells or the culture in which the cells are grown. For example, the methods of the invention
include a process for producing a polypeptide in which a host cell containing a suitable
expression vector that includes a polynucleotide of the invention is cultured under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered from the
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such
process is a full length or mature form of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial cells which
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein
domains.

The purified polypeptides can be used in in vitro binding assays which are well known in
the art to identify molecules which bind to the polypeptides. These molecules include, but are not
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then tested for either
cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the peptides
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the
specificity of the binding molecule for SEQ ID NO:1787-3572 and 5359-7144.

The protein of the invention may also be expressed as a product of transgenic animals,
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized
by somatic or germ cells containing a nucleotide sequence encoding the protein.

The proteins provided herein also include proteins characterized by amino acid sequences
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be
made by those skilled in the art using known techniques. Modifications of interest in the protein
sequences may include the alteration, substitution, replacement, insertion or deletion of a
selected amino acid residue in the coding sequence. For example, one or more of the cysteine
residues may be deleted or replaced with another amino acid to alter the conformation of the
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such
alteration, substitution, replacement, insertion or deletion retains the desired activity of the
protein. Regions of the protein that are important for the protein function can be determined by
various methods known in the art including the alanine-scanning method, which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the
resulting alanine-containing variant for biological activity. This type of analysis determines the
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are
important for protein function may be determined by the eMATRIX program.
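
The alanine-scanning approach described above can be sketched in Python as an enumeration of the variants to be tested. The function name alanine_scan, the window parameter (1 for single residues, larger values for "strings" of residues), and the example peptide are assumptions for illustration; the biological activity assay itself is performed experimentally and is not modeled here.

```python
def alanine_scan(protein: str, window: int = 1) -> list[tuple[int, str]]:
    """List alanine-scanning variants of a protein sequence.

    Each variant replaces `window` consecutive residues with alanine.
    Returns (1-based start position, variant sequence) pairs. Windows
    that already consist solely of alanine are skipped because the
    substitution would leave the sequence unchanged.
    """
    variants = []
    for start in range(len(protein) - window + 1):
        segment = protein[start:start + window]
        if set(segment) == {"A"}:
            continue
        variant = protein[:start] + "A" * window + protein[start + window:]
        variants.append((start + 1, variant))
    return variants


# Hypothetical four-residue peptide.
for position, variant in alanine_scan("MKAC"):
    print(position, variant)
```

Each listed variant would then be expressed and assayed; positions whose substitution reduces activity are candidates for functionally important residues.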

Other fragments and derivatives of the sequences of proteins which would be expected to
retain protein activity in whole or in part and are useful for screening or other immunological
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention.

The protein may also be produced by operably linking the isolated polynucleotide of the
invention to suitable control sequences in one or more insect expression vectors, and employing
an insect expression system. Materials and methods for baculovirus/insect cell expression
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A.
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present
invention is "transformed."

The protein of the invention may be prepared by culturing transformed host cells under 
culture conditions suitable to express the recombinant protein. The resulting expressed protein 
may then be purified from such culture (i.e., from culture medium or cell extracts) using known 
purification processes, such as gel filtration and ion exchange chromatography. The purification 
of the protein may also include an affinity column containing agents which will bind to the 
protein; one or more column steps over such affinity resins as concanavalin A-agarose, 
heparin-toyopearl™ or Cibacrom blue 3GA Sepharose™; one or more steps involving 
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl 
ether; or immunoaffinity chromatography. 

Alternatively, the protein of the invention may also be expressed in a form which will 
facilitate purification. For example, it may be expressed as a fusion protein, such as those of 
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a 
His tag. Kits for expression and purification of such fusion proteins are commercially available 
from New England BioLab (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen, 
respectively. The protein can also be tagged with an epitope and subsequently purified by using 
a specific antibody directed to such epitope. One such epitope ("FLAG©") is commercially 
available from Kodak (New Haven, Conn.). 

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other 
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing 
purification steps, in various combinations, can also be employed to provide a substantially 
homogeneous isolated recombinant protein. The protein thus purified is substantially free of 
other mammalian proteins and is defined in accordance with the present invention as an "isolated 
protein." 






PCT/US00/34263

The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids has been deleted, inserted, or substituted.
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which
may be fused to the polypeptide or an analog include, for example, targeting moieties which
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as
alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the largest 
match between the sequences tested. Methods to determine identity and similarity are codified in 
computer programs including, but not limited to, the GCG program package, including GAP 
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group, 
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F. 
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res. 
vol 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp. 
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software 
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam software 
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by 
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp. 
105-31 (1982), incorporated herein by reference). The BLAST programs are publicly available 
from the National Center for Biotechnology Information (NCBI) and other sources (BLAST 
Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol. 
Biol. 215:403-410 (1990)). 
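
As a rough illustration of what "the largest match between the sequences tested" means in practice, the following Python sketch computes percent identity from a Needleman-Wunsch global alignment. It is not the GAP or BLAST implementation: the function names are hypothetical, and the unit match, mismatch and gap scores are assumptions chosen for brevity, whereas the cited programs use substitution matrices and more elaborate gap penalties.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns the two gapped strings.

    Scores are illustrative assumptions, not the defaults of any cited program.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of alignment length."""
    al_a, al_b = global_align(a, b)
    if not al_a:
        return 0.0
    matches = sum(1 for x, y in zip(al_a, al_b) if x == y and x != "-")
    return 100.0 * matches / len(al_a)


if __name__ == "__main__":
    print(global_align("ACGT", "AGT"))
    print(percent_identity("ACGT", "AGT"))
```

Note that percent identity depends on the denominator chosen (alignment length here; other conventions use the shorter sequence length), so values from different programs are not directly comparable.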

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric 
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to 
another polypeptide. Within a fusion protein the polypeptide according to the invention can 
correspond to all or a portion of a protein according to the invention. In one embodiment, a 
fusion protein comprises at least one biologically active portion of a protein according to the 
invention. In another embodiment, a fusion protein comprises at least two biologically active 
portions of a protein according to the invention. Within the fusion protein, the term "operatively 
linked" is intended to indicate that the polypeptide according to the invention and the other 
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or 
C-terminus. 

For example, in one embodiment a fusion protein comprises a polypeptide according to 
the invention operably linked to the extracellular domain of a second protein. 

In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide 
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione 
S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which 
polypeptide sequences according to the invention, comprising one or more domains, are fused 
to sequences derived from a member of the immunoglobulin protein family. The 
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical 
compositions and administered to a subject to inhibit an interaction between a ligand and a 
protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo. 
The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand. 
Inhibition of the ligand/protein interaction may be useful therapeutically both for the treatment of 
proliferative and differentiative disorders, e.g., cancer, and for modulating (e.g., promoting or 
inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be 
used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays 
to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand. 

A chimeric or fusion protein of the invention can be produced by standard recombinant 
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences 
are ligated together in-frame in accordance with conventional techniques, e.g., by employing 
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for 
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to 
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can 
be synthesized by conventional techniques including automated DNA synthesizers. 
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that 
give rise to complementary overhangs between two consecutive gene fragments that can 
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for 
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley & 
Sons, 1992). Moreover, many expression vectors are commercially available that already encode 
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the 
invention can be cloned into such an expression vector such that the fusion moiety is linked 
in-frame to the protein of the invention. 
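
The requirement that the fusion moiety be linked "in-frame" can be checked computationally before any cloning is attempted. The following Python sketch is an illustration only: the function names and the toy sequences are hypothetical and are not sequences of the invention, and the standard genetic code is assumed. It joins a fusion-moiety coding sequence to an insert and rejects joins that would shift the reading frame or introduce an upstream stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def fuse_in_frame(moiety_cds, insert_cds, linker=""):
    """Return moiety + linker + insert, or raise if the join is not in-frame."""
    moiety, insert, linker = moiety_cds.upper(), insert_cds.upper(), linker.upper()
    # Drop the fusion moiety's own stop codon so translation reads through.
    if len(moiety) % 3 == 0 and moiety[-3:] in STOP_CODONS:
        moiety = moiety[:-3]
    upstream = moiety + linker
    if len(upstream) % 3 != 0:
        raise ValueError("upstream sequence would shift the reading frame")
    if "*" in translate(upstream):
        raise ValueError("premature stop codon upstream of the insert")
    if len(insert) % 3 != 0:
        raise ValueError("insert is not a whole number of codons")
    return upstream + insert


if __name__ == "__main__":
    fused = fuse_in_frame("ATGGCCTAA", "ATGAAATAA")  # toy sequences
    print(fused, translate(fused))
```

A single extra linker base (for example `linker="G"`) causes `fuse_in_frame` to raise `ValueError`, which mirrors the frame shift that such a join would produce in the expressed protein.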

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of normal 
function of the encoded protein. The invention thus provides gene therapy to restore normal 
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of 
the invention. Delivery of a functional gene encoding polypeptides of the invention to 
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly 
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of 
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example, 
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of 
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific 
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of 
the nucleotides of the present invention or a gene encoding the polypeptides of the present 
invention can also be accomplished with extrachromosomal substrates (transient expression) or 
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 
Alternatively, it is contemplated that in other human disease states, preventing the expression of 
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease 
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively 
regulate the expression of polypeptides of the invention. 

Other methods inhibiting expression of a protein include the introduction of antisense 
molecules to the nucleic acids of the present invention, their complements, or their translated RNA 
sequences, by methods known in the art. Further, the polypeptides of the present invention can be 
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such 
as a silencer, which is tissue specific. 
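
As a minimal illustration of the antisense approach mentioned above, an antisense DNA oligonucleotide is the reverse complement of its target. The Python sketch below assumes simple Watson-Crick pairing and uses a hypothetical function name and a toy target; it does not address oligonucleotide length, chemistry, or target-site selection.

```python
# Complement map for DNA or RNA input; the output is always written as DNA.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def antisense_dna(target):
    """Return the DNA reverse complement of a DNA or RNA target sequence."""
    return target.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(antisense_dna("AUGGCC"))  # toy target, not a sequence of the invention
```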

The present invention still further provides cells genetically engineered in vivo to express the 
polynucleotides of the invention, wherein such polynucleotides are in operative association with a 
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in 
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of 
the present invention. 

Knowledge of DNA sequences provided by the invention allows for modification of cells to 
permit, increase, or decrease expression of endogenous polypeptide. Cells can be modified (e.g., by 
homologous recombination) to provide increased polypeptide expression by replacing, in whole or 
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells 
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is 
operatively linked to the desired protein encoding sequences. See, for example, PCT International 
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT 
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous 
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or 
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired 
protein coding sequence, amplification of the marker DNA by standard selection methods results in 
co-amplification of the desired protein coding sequences in the cells. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may 
be replaced by homologous recombination. As described herein, gene targeting can be used to 
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene 
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory 
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative 
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations 
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or 
protein produced may be replaced, removed, added, or otherwise modified by targeting. These 
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences 
for enhancing or modifying transport or secretion properties of the protein, or other sequences 
which alter or improve the function or stability of protein or RNA molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the gene 
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both 
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory 
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the 
targeting event may replace an existing element; for example, a tissue-specific enhancer can be 
replaced by an enhancer that has broader or different cell-type specificity than the naturally 
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are 
added. In all cases, the identification of the targeting event may be facilitated by the use of one or 
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection 
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the 
targeting event may also be facilitated by the use of one or more marker genes exhibiting the 
property of negative selection, such that the negatively selectable marker is linked to the exogenous 
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and 
such that a correct homologous recombination event with sequences in the host cell genome does 
not result in the stable integration of the negatively selectable marker. Markers useful for this 
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial 
xanthine-guanine phosphoribosyl-transferase (gpt) gene. 

The gene targeting or gene activation techniques which can be used in accordance with this 
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel; 
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627 
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436 
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety. 

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the 
invention in vivo, one or more genes provided by the invention are either overexpressed or 
inactivated in the germ line of animals using homologous recombination [Capecchi, Science 
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory 
control of exogenous or endogenous promoter elements, are known as transgenic animals. 
Animals in which an endogenous gene has been inactivated by homologous recombination are 
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be 
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic 
animals are useful to determine the roles polypeptides of the invention play in biological 
processes, and preferably in disease states. Transgenic animals are useful as model systems to 
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human 
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT 
Publication No. WO94/28122, incorporated herein by reference. 

Transgenic animals can be prepared wherein all or part of a promoter of the 
polynucleotides of the invention is either activated or inactivated to alter the level of expression 
of the polypeptides of the invention. Inactivation can be carried out using homologous 
recombination methods described above. Activation can be achieved by supplementing or even 
replacing the homologous promoter to provide for increased protein expression. The homologous 
promoter can be supplemented by insertion of one or more heterologous enhancer elements 
known to confer promoter activation in a particular tissue. 

The polynucleotides of the present invention also make possible the development, 
through, e.g., homologous recombination or knockout strategies, of animals that fail to express 
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as 
models for studying the in vivo activities of polypeptide as well as for studying modulators of the 
polypeptides of the invention. 

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or 
more of the uses or biological activities (including those associated with assays cited herein) 
identified herein. Uses or activities described for proteins of the present invention may be 
provided by administration or use of such proteins or of polynucleotides encoding such proteins 
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The 
mechanism underlying the particular condition or pathology will dictate whether the 
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or 
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic 
compositions of the invention" include compositions comprising isolated polynucleotides 
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or 
polypeptides of the invention (including full length protein, mature protein and truncations or 
domains thereof), or compounds and other substances that modulate the overall activity of the 
target gene products, either at the level of target gene/protein expression or target protein 
activity. Such modulators include polypeptides, analogs (variants), including fragments and 
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or 
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening 
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple 
helix formation; and in particular antibodies or other binding partners that specifically recognize 
one or more epitopes of the polypeptides of the invention. 

The polypeptides of the present invention may likewise be involved in cellular activation 
or in one of the other physiological pathways described herein. 

4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research 
community for various purposes. The polynucleotides can be used to express recombinant 
protein for analysis, characterization or therapeutic use; as markers for tissues in which the 
corresponding protein is preferentially expressed (either constitutively or at a particular stage of 
tissue differentiation or development or in disease states); as molecular weight markers on gels; 
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene 
positions; to compare with endogenous DNA sequences in patients to identify potential genetic 
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of 
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known 
sequences in the process of discovering other novel polynucleotides; for selecting and making 
oligomers for attachment to a "gene chip" or other support, including for examination of 
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as 
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the 
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for 
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap 
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify 
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of 
the binding interaction. 

The polypeptides provided by the present invention can similarly be used in assays to 
determine biological activity, including in a panel of multiple proteins for high-throughput 
screening; to raise antibodies or to elicit another immune response; as a reagent (including the 
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its 
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is 
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or 
development or in a disease state); and, of course, to isolate correlative receptors or ligands. 
Proteins involved in these binding interactions can also be used to screen for peptide or small 
molecule inhibitors or agonists of the binding interaction. 

Any or all of these research utilities are capable of being developed into reagent grade or 
kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in the art. 
References disclosing such methods include without limitation "Molecular Cloning: A 
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch 
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning 
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987. 

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as nutritional 
sources or supplements. Such uses include without limitation use as a protein or amino acid 
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In 
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a 
particular organism or can be administered as a separate solid or liquid preparation, such as in the 
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the 
polypeptide or polynucleotide of the invention can be added to the medium in or on which the 
microorganism is cultured. 

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell 
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting) 
activity or may induce production of other cytokines in certain cell populations. A 
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many 
protein factors discovered to date, including all known cytokines, have exhibited activity in one 
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient 
confirmation of cytokine activity. The activity of therapeutic compositions of the present 
invention is evidenced by any one of a number of routine factor-dependent cell proliferation 
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3, 
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK, 
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following: 

Assays for T-cell or thymocyte proliferation include without limitation those described 
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. 
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, 
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in 
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol. 
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli, 
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994. 

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or 
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation, 
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse 
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994. 

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells 
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2 
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in 
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991; 
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988; 
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse 
and human interleukin 6-Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol 
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. 
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11-Bennett, F., Giannotti, J., 
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin 
9-Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. 
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991. 

Assays for T-cell clone responses to antigens (which will identify, among others, proteins 
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and 
cytokine production) include, without limitation, those described in: Current Protocols in 
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, 
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse 
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7, 
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol. 
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988. 

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity and 
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem 
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or 
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or 
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential 
state which would be useful for re-engineering damaged or diseased tissues, transplantation, 
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce 
large quantities of human cells has important working applications for the production of human 
proteins which currently must be obtained from non-human sources or donors; implantation of 
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases; 
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including 
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs 
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung. 

It is contemplated that multiple different exogenous growth factors and/or cytokines may 
be administered in combination with the polypeptide of the invention to achieve the desired 
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and 
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand 
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage 
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet 
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast 
growth factor (bFGF). 

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of 
these cells in culture will facilitate the production of large quantities of mature cells. Techniques 
for culturing stem cells are known in the art and administration of polypeptides of the invention, 
optionally with other growth factors and/or cytokines, is expected to enhance the survival and 
proliferation of the stem cell populations. This can be accomplished by direct administration of 
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected 
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder 
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers 
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or 
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926). 

Stem cells themselves can be transfected with a polynucleotide of the invention to induce 
autocrine expression of the polypeptide of the invention. This will allow for generation of 
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be 
differentiated into the desired mature cell types. These stable cell lines can also serve as a source 
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for 
polymerase chain reaction experiments. These studies would allow for the isolation and 
identification of differentially expressed genes in stem cell populations that regulate stem cell 
proliferation and/or maintenance. 

Expansion and maintenance of totipotent stem cell populations will be useful in the 
treatment of many pathological conditions. For example, polypeptides of the present invention 
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be 
1 5 used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or 
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation 
of neural cells and for the regeneration of nerve and brain tissue, i.e. for the treatment of central 
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic 
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition, 
the expanded stem cell populations can also be genetically altered for gene therapy purposes and
to decrease host rejection of replacement tissues after grafting or implantation. 

Expression of the polypeptide of the invention and its effect on stem cells can also be 
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell 
types. A broadly applicable method of obtaining pure populations of a specific differentiated 
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998))
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the 
effects of endogenous stem cell factor activity and allow differentiation to proceed. 

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention 
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al. Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g., as described by
Bernstein et al., Blood, 77: 2316-2321 (1991). 

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation of
megakaryocytes and consequently of platelets thereby allowing prevention or treatment of
various platelet disorders such as thrombocytopenia, and generally for use in place of or
complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in vivo or ex vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following: 

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others,
proteins that influence embryonic differentiation hematopoiesis) include, without limitation,
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.

4.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage, tendon,
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue
repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone growth in
circumstances where bone is not normally formed, has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is
useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming cells,
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.)
mediated by inflammatory processes may also be possible using the composition of the
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed, has application
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by
a composition of the present invention contributes to the repair of congenital, trauma induced, or
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention may
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis,
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular
insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or inhibiting
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following:

Assays for tissue generation activity include, without limitation, those described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No.
WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY

A polypeptide of the present invention may also exhibit immune stimulating or immune
suppressing activity, including without limitation the activities for which assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as effecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune
response. The functions of activated T cells may be inhibited by suppressing T cell responses or
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is
generally an active, non-antigen-specific, process which requires continuous exposure of the T
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence
of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells,
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus acts as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it
may also be necessary to block the function of a combination of B lymphocyte antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic
compositions of the invention on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T
cell-derived cytokines which may be involved in the disease process. Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal models of
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means
of upregulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response may be useful in cases of viral
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory form of
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present
invention as described herein such that the cells express all or a portion of the protein on their
surface, and reintroduce the transfected cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to, and thereby activating, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins 
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte 
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry 
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology 
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International 
Journal of Oncology 1:639-648, 1992. 

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by
the following methods.

Assays for activin/inhibin activity include, without limitation, those described in: Vale et 
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature 
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986. 

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or chemokinetic 
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the 
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic 
receptor activation can be used to mobilize or attract a desired cell population to a desired site of 
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or 
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of 
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved 
immune responses against the tumor or infecting agent. 

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146,
1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867,
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994.



4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following:

Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins 
35:467-474, 1988. 


4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or 
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For 
example, the presence or increased expression of a polynucleotide/polypeptide of the invention 
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy. 
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer 
condition. Identification of single nucleotide polymorphisms associated with cancer or a 
predisposition to cancer may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, 
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) 
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic 
compositions of the invention may be effective in adult and pediatric oncology including in solid 
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic 
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma, 
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer, 
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell 
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal 
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps 
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including 
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian 
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle, 
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors, 
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central 
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma, 
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma, 
hemangiopericytoma and Kaposi's sarcoma. 

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including 
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be 
administered to treat cancer. Therapeutic compositions can be administered in therapeutically 
effective dosages alone or in combination with adjuvant cancer therapy such as surgery, 
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial 
effect, e.g., reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise 
improving overall clinical condition, without necessarily eradicating the cancer. 

The composition can also be administered in therapeutically effective amounts as a 
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or 
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically 
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine. 
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination 
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide, 
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin 
(cis-DDP), Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin, 
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (VP16-213), 
Floxuridine, 5-Fluorouracil (5-FU), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, 
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog), 
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna, 
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl, 
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate, 
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin, 
Semustine, Teniposide, and Vindesine sulfate. 

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g. 
exposure to carcinogens) known in the art that predispose an individual to developing cancers. 
Under these circumstances, it may be beneficial to treat these individuals with therapeutically 
effective doses of the polypeptide of the invention to reduce the risk of developing cancers. 

In vitro models can be used to determine the effective doses of the polypeptide of the 
invention as a potential cancer treatment. These in vitro models include proliferation assays of 
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of 
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21), 
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30 
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in 
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction 
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial 
cell migration as described in Ribatti et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al., 
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available, 
e.g., from American Type Tissue Culture Collection catalogs. 
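
By way of non-limiting illustration, the reading of an effective dose from an in vitro proliferation assay may be sketched as follows. This is a minimal Python sketch under stated assumptions: the function name, the example data and the choice of log-linear interpolation between the two bracketing doses are hypothetical and are not taken from the protocols cited above.

```python
import math


def half_maximal_dose(doses, viability):
    """Estimate the dose at which proliferation falls to 50% of control.

    doses     -- ascending, strictly positive doses (hypothetical units, e.g. ug/ml)
    viability -- proliferation at each dose as a fraction of the untreated control
    Returns None when the 50% level is never crossed.
    """
    points = list(zip(doses, viability))
    for (d_lo, v_lo), (d_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= 0.5 >= v_hi and v_lo != v_hi:
            # Interpolate linearly in log10(dose) between the bracketing points.
            frac = (v_lo - 0.5) / (v_lo - v_hi)
            log_dose = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_dose
    return None


# Hypothetical assay readout, for illustration only.
example_doses = [0.1, 1.0, 10.0, 100.0]
example_viability = [0.95, 0.80, 0.20, 0.05]
estimate = half_maximal_dose(example_doses, example_viability)
```

A fitted sigmoidal model would normally be preferred where enough dose points are available; interpolation is used here only to keep the sketch short.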

4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as a receptor, 
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the 
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors 
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and 
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions 
and their ligands (including without limitation, cellular adhesion molecules (such as selectins, 
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen 
recognition and development of cellular and humoral immune responses). Receptors and ligands 
are also useful for screening of potential peptide or small molecule inhibitors of the relevant 
receptor/ligand interaction. A protein of the present invention (including, without limitation, 
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand 
interactions. 

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in: 
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. 
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28, 
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22); Takai et al., Proc. 
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988; 
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods 
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995. 

By way of example, the polypeptides of the invention may be used as a receptor for a 
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified 
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel 
overlay assays, or other methods known in the art. 

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or 
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the 
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes, 
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein 
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic 
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and 
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent 
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of 
toxins include, but are not limited to, ricin. 

4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the 
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques. 
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a 
solid support, borne on a cell surface or located intracellularly. One method of drug screening 
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant 
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such 
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can 
be used for standard binding assays. One may measure, for example, the formation of complexes 
between polypeptides of the invention or fragments and the agent being tested or examine the 
diminution in complex formation between the novel polypeptides and an appropriate cell line, 
which are well known in the art. 
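
By way of non-limiting illustration, the diminution in complex formation measured in a competitive binding assay may be expressed as a percent inhibition of specific binding. The following minimal Python sketch assumes hypothetical signal readings (total binding without competitor, nonspecific background, and binding in the presence of each test compound); the function names, the 50% hit threshold and the data are assumptions for illustration and do not appear in the specification.

```python
def percent_inhibition(signal_with_compound, signal_total, signal_nonspecific):
    """Percent reduction in specific complex formation caused by a test compound."""
    specific_total = signal_total - signal_nonspecific
    if specific_total <= 0:
        raise ValueError("total binding must exceed nonspecific background")
    specific_bound = signal_with_compound - signal_nonspecific
    return 100.0 * (1.0 - specific_bound / specific_total)


def rank_hits(readings, signal_total, signal_nonspecific, threshold=50.0):
    """Return (compound, percent inhibition) pairs at or above threshold, strongest first."""
    scored = [
        (name, percent_inhibition(signal, signal_total, signal_nonspecific))
        for name, signal in readings.items()
    ]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical plate readings (arbitrary signal units), for illustration only.
example_readings = {"cmpd_A": 300.0, "cmpd_B": 900.0, "cmpd_C": 600.0}
example_hits = rank_hits(example_readings, signal_total=1000.0, signal_nonspecific=200.0)
```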

Sources for test compounds that may be screened for ability to bind to or modulate (i.e., 
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and 
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of either random or mimetic peptides, oligonucleotides or organic molecules. 

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and 
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for 
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine 
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include 
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a 
review, see Science 282:63-68 (1998). 

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or 
organic compounds and can be readily prepared by traditional automated synthesis methods, 
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and 
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein, 
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries. 
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin. 
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see 
Al-Obeidi et al., Mol. Biotechnol., 9(3):205-23 (1998); Hruby et al., Curr. Opin. Chem. Biol., 
1(1):114-19 (1997); Dorner et al., Bioorg. Med. Chem., 4(5):709-15 (1996) (alkylated dipeptides). 

Identification of modulators through use of the various libraries described herein permits 
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a 
polypeptide of the invention. The molecules identified in the binding assay are then tested for 
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the 
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested 
for either cell/animal death or prolonged survival of the animal/cells. 

The binding molecules thus identified may be complexed with toxins, e.g., ricin or 
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding 
molecule complex is then targeted to a tumor or other cell by the specificity of the binding 
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be 
complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g., a 
ligand or a receptor. The art provides numerous assays particularly useful for identifying 
previously unknown binding partners for receptor polypeptides of the invention. For example, 
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used 
to identify polynucleotides encoding binding partners. As another example, affinity 
chromatography with the appropriate immobilized polypeptide of the invention can be used to 
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number 
of different libraries used for the identification of compounds, and in particular small molecules, 
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention. 
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous 
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for 
the expression of the receptor of the invention: one cell population expresses the receptor of the 
invention whereas the other does not. The responses of the two cell populations to the addition of 
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the 
polypeptide of the invention in cells and assayed for an autocrine response to identify potential 
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known 
in the art can be used to identify binding partner polypeptides, including, (1) organic and 
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of random peptides, oligonucleotides or organic molecules. 
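
By way of non-limiting illustration, the comparison of two otherwise identical cell populations described above may be sketched as a fold-change screen. In this minimal Python sketch the function name, the two-fold cutoff and the response values are hypothetical assumptions; any suitable readout (for example, a reporter signal) could stand in for the response.

```python
def receptor_dependent_ligands(resp_expressing, resp_control, min_fold=2.0):
    """Flag ligands whose response depends on expression of the receptor.

    resp_expressing -- ligand name -> response of cells expressing the receptor
    resp_control    -- ligand name -> response of otherwise identical cells lacking it
    Ligands with a missing or non-positive control response are skipped.
    """
    hits = []
    for ligand, with_receptor in resp_expressing.items():
        without_receptor = resp_control.get(ligand)
        if without_receptor is None or without_receptor <= 0:
            continue
        fold = with_receptor / without_receptor
        if fold >= min_fold:
            hits.append((ligand, fold))
    # Strongest receptor-dependent response first.
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical background-subtracted responses, for illustration only.
expressing = {"ligand_1": 10.0, "ligand_2": 3.0, "ligand_3": 8.0}
control = {"ligand_1": 2.0, "ligand_2": 3.0, "ligand_3": 2.0}
candidates = receptor_dependent_ligands(expressing, control)
```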

The role of downstream intracellular signaling molecules in the signaling cascade of the 
polypeptide of the invention can be determined. For example, a chimeric protein in which the 
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a 
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated 
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating 
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then 
be assayed for expected modifications, e.g., phosphorylation. Other methods known to those in the 
art can also be used to identify signaling molecules involved in receptor activity. 

4.10.15 ANTI-INFLAMMATORY ACTIVITY 

Compositions of the present invention may also exhibit anti-inflammatory activity. The 
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the 
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example, 
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory 
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production 
of other factors which more directly inhibit or promote an inflammatory response. Compositions 
with such activities can be used to treat inflammatory conditions (including chronic or acute 
conditions), including without limitation inflammation associated with infection (such as septic 
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury, 
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or 
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or inflammation 
resulting from overproduction of cytokines such as TNF or IL-1. Compositions of the invention 
may also be useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material. 
Compositions of this invention may be utilized to prevent or treat conditions such as, but not 
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid 
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1, 
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary 
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent such 
as for acute or chronic myelogenous leukemia, or in the prevention of premature labor secondary 
to intrauterine infections. 









4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a 
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the 
invention. Such leukemias and related disorders include but are not limited to acute leukemia, 
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic, 
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic 
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see 
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia). 



4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication of 
therapeutic utility, include but are not limited to nervous system injuries, and diseases or 
disorders which result in either a disconnection of axons, a diminution or degeneration of 
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including 
human and non-human mammalian patients) according to the invention include but are not 
limited to the following lesions of either the central (including spinal cord, brain) or peripheral 
nervous systems: 

(i) traumatic lesions, including lesions caused by physical injury or associated with 
surgery, for example, lesions which sever a portion of the nervous system, or compression 
injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system 
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord 
infarction or ischemia; 

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured 
as a result of infection, for example, by an abscess or associated with infection by human 
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease, 
tuberculosis, syphilis; 

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or 
injured as a result of a degenerative process including but not limited to degeneration associated 
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral 
sclerosis; 

(v) lesions associated with nutritional diseases or disorders, in which a portion of the 
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism 
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease, 
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus 
callosum), and alcoholic cerebellar degeneration; 

(vi) neurological lesions associated with systemic diseases including but not limited to 
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or 
sarcoidosis; 

(vii) lesions caused by toxic substances including alcohol, lead, or particular 
neurotoxins; and 

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or 
injured by a demyelinating disease including but not limited to multiple sclerosis, human 
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies, 
progressive multifocal leukoencephalopathy, and central pontine myelinolysis. 

Therapeutics which are useful according to the invention for treatment of a nervous 
system disorder may be selected by testing for biological activity in promoting the survival or 
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit 
any of the following effects may be useful according to the invention: 

(i) increased survival time of neurons in culture; 

(ii) increased sprouting of neurons in culture or in vivo; 

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g., 
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or 

(iv) decreased symptoms of neuron dysfunction in vivo. 

Such effects may be measured by any method known in the art. In preferred, 
non-limiting embodiments, increased survival of neurons may be measured by the method set 
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may 
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al. 
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may 
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc., 
depending on the molecule to be measured; and motor neuron dysfunction may be measured by 
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron 
conduction velocity, or functional disability. 

In specific embodiments, motor neuron disorders that may be treated according to the 
invention include but are not limited to disorders such as infarction, infection, exposure to toxin, 
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as 
well as other components of the nervous system, as well as disorders that selectively affect 
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal 
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile 
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome), 
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy 
(Charcot-Marie-Tooth Disease). 

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional 
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents, 
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing 
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye 
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape 
(such as, for example, breast augmentation or diminution, change in bone form or shape); 
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female 
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or 
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other 
nutritional factors or component(s); affecting behavioral characteristics, including, without 
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression 
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain 
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other 
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting 
deficiencies of the enzyme and treating deficiency-related diseases; treatment of 
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such 
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen 
in a vaccine composition to raise an immune response against such protein or another material or 
entity which is cross-reactive with such protein. 

4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such 
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis 
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or 
susceptibility to various disease states (such as disorders involving inflammation or immune 
response) or a differential response to drug administration, and this genetic information can be 
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a 
polymorphism associated with a predisposition to inflammation or autoimmune disease makes 
possible the diagnosis of this condition in humans by identifying the presence of the 
polymorphism. 

Polymorphisms can be identified in a variety of ways known in the art which all 
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally 
involving isolation or amplification of the DNA, and identifying the presence of the 
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment 
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to 
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are 
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a 
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately 
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides). 
In addition, traditional restriction fragment length polymorphism analysis (using restriction 
enzymes that provide differential digestion of the genomic DNA depending on the presence or 
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the 
present invention can be used to detect polymorphisms. The array can comprise modified 
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the 
present invention. In the alternative, any one of the nucleotide sequences of the present 
invention can be placed on the array to detect changes from those sequences. 
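
By way of non-limiting illustration, the restriction fragment length polymorphism analysis mentioned above may be simulated in silico. The following minimal Python sketch digests two hypothetical alleles with EcoRI (recognition site GAATTC, cut after the first G) and compares the resulting fragment lengths. The sequences and function names are assumptions made for illustration, and only the top strand of a linear molecule is considered.

```python
def fragment_lengths(sequence, site, cut_offset):
    """Lengths of the fragments produced by cutting a linear sequence at every site.

    cut_offset is the number of bases into the recognition site at which the
    top strand is cut (1 for EcoRI, which cuts G^AATTC).
    """
    sequence = sequence.upper()
    site = site.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


def differs_by_rflp(allele_a, allele_b, site, cut_offset):
    """True when the two alleles give different fragment-length patterns."""
    pattern_a = sorted(fragment_lengths(allele_a, site, cut_offset))
    pattern_b = sorted(fragment_lengths(allele_b, site, cut_offset))
    return pattern_a != pattern_b


ECORI_SITE = "GAATTC"
ECORI_OFFSET = 1

# Hypothetical alleles: a single base change (A to G) destroys the second site.
REFERENCE = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TTTT"
VARIANT = "AAAA" + "GAATTC" + "CCCCCC" + "GAGTTC" + "TTTT"

reference_fragments = fragment_lengths(REFERENCE, ECORI_SITE, ECORI_OFFSET)
variant_fragments = fragment_lengths(VARIANT, ECORI_SITE, ECORI_OFFSET)
```

In this example the polymorphism removes one cut site, so the variant allele yields two fragments where the reference yields three.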

Alternatively, a polymorphism resulting in a change in the amino acid sequence could 
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g., 
by an antibody specific to the variant sequence. 

4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against rheumatoid 
arthritis are determined in an experimental animal model system. The experimental model system 
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983, 
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129. 
Induction of the disease can be caused by a single injection, generally intradermally, of a 
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The 
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant 
mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about 
1-5 mg/kg. The control consists of administering PBS only. 

The procedure for testing the effects of the test compound would consist of intradermally 
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the 
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and 
24 days after injection of Mycobacterium in CFA, an overall arthritis score may be obtained as 
described by J. Holoshitz above. An analysis of the data would reveal that the test compound 
would have a dramatic effect on the swelling of the joints as measured by a decrease of the 
arthritis score. 
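
By way of non-limiting illustration, the comparison of arthritis scores between treated and control animals on each scoring day may be tabulated as in the following minimal Python sketch. The function names and all score values are hypothetical placeholders and do not represent experimental results; the scoring scale itself is that of the cited protocol and is not reproduced here.

```python
from statistics import mean


def mean_scores_by_day(scores):
    """Map each scoring day to the mean arthritis score of the group on that day.

    scores -- day -> list of per-animal arthritis scores
    """
    return {day: mean(values) for day, values in sorted(scores.items())}


def score_reduction(treated, control):
    """Control mean minus treated mean for every day scored in both groups."""
    treated_means = mean_scores_by_day(treated)
    control_means = mean_scores_by_day(control)
    return {
        day: control_means[day] - treated_means[day]
        for day in control_means
        if day in treated_means
    }


# Placeholder scores for two animals per group on days 14 and 24 (not real data).
control_group = {14: [4, 6], 24: [8, 10]}
treated_group = {14: [2, 4], 24: [3, 5]}
reduction = score_reduction(treated_group, control_group)
```

A positive value indicates a lower mean score in the treated group on that day.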

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or 
other binding partners or modulators including antisense polynucleotides) of the invention have 
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications 
include, but are not limited to, those exemplified herein. 



4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the 
polypeptides or other composition of the invention to individuals affected by a disease or 
disorder that can be modulated by regulating the peptides of the invention. While the mode of 
administration is not particularly important, parenteral administration is preferred. An 
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the 
polypeptides or other composition of the invention will normally be determined by the 
prescribing physician. It is to be expected that the dosage will vary according to the age, weight, 
condition and response of the individual patient. Typically, the amount of polypeptide 
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with 
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral 
administration, polypeptides of the invention will be formulated in an injectable form combined 
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art 
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting 
of small amounts of human serum albumin. The vehicle may contain minor amounts of 
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient. 
The preparation of such solutions is within the skill of the art. 
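
By way of non-limiting illustration, the weight-based dose ranges stated above translate into absolute amounts per dose as in the following minimal Python sketch. The function and constant names and the 70 kg body weight are assumptions chosen for the example; the sketch performs unit conversion only and is not dosing guidance.

```python
UG_PER_MG = 1000.0

# Preferred range stated above: about 0.1 ug/kg to 10 mg/kg, expressed in mg/kg.
PREFERRED_LOW_MG_PER_KG = 0.1 / UG_PER_MG
PREFERRED_HIGH_MG_PER_KG = 10.0


def dose_range_mg(body_weight_kg, low_mg_per_kg, high_mg_per_kg):
    """Absolute (low, high) amount per dose in mg for a given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return body_weight_kg * low_mg_per_kg, body_weight_kg * high_mg_per_kg


# Hypothetical 70 kg patient: roughly 0.007 mg (7 ug) up to 700 mg per dose.
low_mg, high_mg = dose_range_mg(70.0, PREFERRED_LOW_MG_PER_KG, PREFERRED_HIGH_MG_PER_KG)
```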

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF 
ADMINISTRATION 

A protein or other composition of the present invention (from whatever source derived, 
including without limitation from recombinant and non-recombinant sources and including 
antibodies and other binding partners of the polypeptides of the invention) may be administered 
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable 
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition 
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents, 
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term 
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the 
effectiveness of the biological activity of the active ingredient(s). The characteristics of the 
carrier will depend on the route of administration. The pharmaceutical composition of the 
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as 
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, 
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell 
factor, and erythropoietin. In further compositions, proteins of the invention may be combined 
with other agents beneficial to the treatment of the disease or disorder in question. These agents 
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth 
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor 
(IGF), as well as cytokines described herein. 

The pharmaceutical composition may further contain other agents which either enhance 
the activity of the protein or other active ingredient or complement its activity or use in 
treatment. Such additional factors and/or agents may be included in the pharmaceutical 
composition to produce a synergistic effect with protein or other active ingredient of the 
invention, or to minimize side effects. Conversely, protein or other active ingredient of the 
present invention may be included in formulations of the particular clotting factor, cytokine, 
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or 
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other 
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as 
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein 
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or 
complexes with itself or other proteins. As a result, pharmaceutical compositions of the 
invention may comprise a protein of the invention in such multimeric or complexed form. 

As an alternative to being included in a pharmaceutical composition of the invention 
including a first protein, a second protein or a therapeutic agent may be concurrently 
administered with the first protein (e.g., at the same time, or at differing times provided that 
therapeutic concentrations of the combination of agents are achieved at the treatment site). 
Techniques for formulation and administration of the compounds of the instant application may 
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest 
edition. A therapeutically effective dose further refers to that amount of the compound sufficient 
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the 
relevant medical condition, or an increase in rate of treatment, healing, prevention or 
amelioration of such conditions. When applied to an individual active ingredient, administered 
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a 
combination, a therapeutically effective dose refers to combined amounts of the active 
ingredients that result in the therapeutic effect, whether administered in combination, serially or 
simultaneously. 

In practicing the method of treatment or use of the present invention, a therapeutically 
effective amount of protein or other active ingredient of the present invention is administered to 
a mammal having a condition to be treated. Protein or other active ingredient of the present 
invention may be administered in accordance with the method of the invention either alone or in 
combination with other therapies such as treatments employing cytokines, lymphokines or other 
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other 
hematopoietic factors, protein or other active ingredient of the present invention may be 
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic 
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially, 
the attending physician will decide on the appropriate sequence of administering protein or other 
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other 
hematopoietic factor(s), thrombolytic or anti-thrombotic factors. 

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
intestinal administration; parenteral delivery, including intramuscular, subcutaneous, 
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous, 
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active 
ingredient of the present invention used in the pharmaceutical composition or to practice the 
method of the present invention can be carried out in a variety of conventional ways, such as oral 
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral 
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternately, one may administer the compound in a local rather than systemic manner, for 
example, via injection of the compound directly into an arthritic joint or into fibrotic tissue, often in 
a depot or sustained release formulation. In order to prevent the scarring process frequently 
occurring as a complication of glaucoma surgery, the compounds may be administered topically, 
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery 
system, for example, in a liposome coated with a specific antibody, targeting, for example, 
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the 
afflicted tissue. 

The polypeptides of the invention are administered by any route that delivers an effective 
dosage to the desired site of action. The determination of a suitable route of administration and 
an effective dosage for a particular indication is within the level of skill in the art. Preferably for 
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage 
ranges for the polypeptides of the invention can be extrapolated from these dosages or from 
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the 
clinician to provide maximal therapeutic benefit. 

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may 
be formulated in a conventional manner using one or more physiologically acceptable carriers 
comprising excipients and auxiliaries which facilitate processing of the active compounds into 
preparations which can be used pharmaceutically. These pharmaceutical compositions may be 
manufactured in a manner that is itself known, e.g., by means of conventional mixing, 
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or 
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen. 
When a therapeutically effective amount of protein or other active ingredient of the present 
invention is administered orally, protein or other active ingredient of the present invention will 
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form, 
the pharmaceutical composition of the invention may additionally contain a solid carrier such as 
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or 
other active ingredient of the present invention, and preferably from about 25 to 90% protein or 
other active ingredient of the present invention. When administered in liquid form, a liquid 
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil, 
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the 
pharmaceutical composition may further contain physiological saline solution, dextrose or other 
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol. 
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to 
90% by weight of protein or other active ingredient of the present invention, and preferably from 
about 1 to 50% protein or other active ingredient of the present invention. 

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or 
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally 
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other 
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within 
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or 
subcutaneous injection should contain, in addition to protein or other active ingredient of the 
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection, 
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or 
other vehicle as known in the art. The pharmaceutical composition of the present invention may 
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of 
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions, 
preferably in physiologically compatible buffers such as Hanks's solution, Ringer's solution, or 
physiological saline buffer. For transmucosal administration, penetrants appropriate to the 
barrier to be permeated are used in the formulation. Such penetrants are generally known in the 
art. 

For oral administration, the compounds can be formulated readily by combining the 
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers 
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules, 
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be 
treated. Pharmaceutical preparations for oral use can be obtained by combining the active 
compounds with a solid excipient, 
optionally grinding a resulting mixture, and processing the mixture of granules, after adding 
suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in 
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose 
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, 
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium 
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents 
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt 
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this 
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic, 
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer 
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be 
added to the tablets or dragee coatings for identification or to characterize different combinations 
of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of 
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or 
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as 
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, 
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in 
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, 
stabilizers may be added. All formulations for oral administration should be in dosages suitable 
for such administration. For buccal administration, the compositions may take the form of 
tablets or lozenges formulated in conventional manner. 

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., 
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or 
other suitable gas. In the case of a pressurized aerosol, the dosage unit may be determined by 
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in 
an inhaler or insufflator may be formulated containing a powder mix of the compound and a 
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral 
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for 
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with 
an added preservative. The compositions may take such forms as suspensions, solutions or 
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, 
stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds 
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or 
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which 
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which 
increase the solubility of the compounds to allow for the preparation of highly concentrated 
solutions. Alternatively, the active ingredient may be in powder form for constitution with a 
suitable vehicle, e.g., sterile pyrogen-free water, before use. 

The compounds may also be formulated in rectal compositions such as suppositories or 
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other 
glycerides. In addition to the formulations described previously, the compounds may also be 
formulated as a depot preparation. Such long acting formulations may be administered by 
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. 
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic 
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as 
sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent 
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and 
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution 
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v 
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system 
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent 
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic 
administration. Naturally, the proportions of a co-solvent system may be varied considerably 
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the 
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may 
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other 
biocompatible polymers may replace polyethylene glycol, e.g., polyvinyl pyrrolidone; and other 
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for 
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well 
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents 
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity. 
Additionally, the compounds may be delivered using a sustained-release system, such as 
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. 
Various types of sustained-release materials have been established and are well known by those 
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the 
biological stability of the therapeutic reagent, additional strategies for protein or other active 
ingredient stabilization may be employed. 
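
As an illustration of the VPD:5W co-solvent system described above, the following Python sketch computes the nominal concentration of each component after the stated 1:1 dilution of VPD with 5% dextrose in water. The helper name `dilute` and the assumption that volumes are simply additive on mixing are ours and are not part of the disclosure; the stock percentages are those given in the text.

```python
# Stock compositions as stated in the text, in % w/v.
VPD = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
DEXTROSE_5W = {"dextrose": 5.0}


def dilute(stock, diluent, stock_parts=1.0, diluent_parts=1.0):
    """Return the % w/v of each component after mixing stock and diluent.

    Hypothetical helper: assumes volumes are additive, which is only an
    approximation for ethanol/water mixtures.
    """
    if stock_parts <= 0 or diluent_parts < 0:
        raise ValueError("parts must be positive (diluent may be zero)")
    total = stock_parts + diluent_parts
    mixed = {name: pct * stock_parts / total for name, pct in stock.items()}
    for name, pct in diluent.items():
        mixed[name] = mixed.get(name, 0.0) + pct * diluent_parts / total
    return mixed


# VPD diluted 1:1 with 5% dextrose in water (VPD:5W).
vpd_5w = dilute(VPD, DEXTROSE_5W)
for component, pct in vpd_5w.items():
    print(f"{component}: {pct:g}% w/v")
```

Under the additive-volume assumption, each VPD component is halved by the 1:1 dilution, and the dextrose concentration in the final mixture is half that of the diluent. Changing `stock_parts` and `diluent_parts` shows how the proportions could be varied, as the text contemplates.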

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers 
or excipients. Examples of such carriers or excipients include but are not limited to calcium 
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and 
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be 
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically 
acceptable base addition salts are those salts which retain the biological effectiveness and 
properties of the free acids and which are obtained by reaction with inorganic or organic bases 
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine, 
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and 
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the 
protein(s) or other active ingredient(s) of the present invention along with protein or peptide 
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T 
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin 
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following 
presentation of the antigen by MHC proteins. MHC and structurally related proteins including 
those encoded by class I and class II MHC genes on host cells will serve to present the peptide 
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified 
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells. 
Alternatively, antibodies able to bind surface immunoglobulin and other molecules on B cells as 
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the 
pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in 
which protein of the present invention is combined, in addition to other pharmaceutically 
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as 
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable 
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides, 
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such 
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. 
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated 
herein by reference. 

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and severity of 
the condition being treated, and on the nature of prior treatments which the patient has 
undergone. Ultimately, the attending physician will decide the amount of protein or other active 
ingredient of the present invention with which to treat each individual patient. Initially, the 
attending physician will administer low doses of protein or other active ingredient of the present 
invention and observe the patient's response. Larger doses of protein or other active ingredient 
of the present invention may be administered until the optimal therapeutic effect is obtained for 
the patient, and at that point the dosage is not increased further. It is contemplated that the 
various pharmaceutical compositions used to practice the method of the present invention should 
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably 
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg 
body weight. For compositions of the present invention which are useful for bone, cartilage, 
tendon or ligament regeneration, the therapeutic method includes administering the composition 
topically, systemically, or locally as an implant or device. When administered, the therapeutic 
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable 
form. Further, the composition may desirably be encapsulated or injected in a viscous form for 
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable 
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other 
active ingredient of the invention, which may also optionally be included in the composition as 
described above, may alternatively or additionally be administered simultaneously or 
sequentially with the composition in the methods of the invention. Preferably for bone and/or 
cartilage formation, the composition would include a matrix capable of delivering the 
protein-containing or other active ingredient-containing composition to the site of bone and/or 
cartilage damage, providing a structure for the developing bone and cartilage and optimally 
capable of being resorbed into the body. Such matrices may be formed of materials presently in 
use for other implanted medical applications. 

The choice of matrix material is based on biocompatibility, biodegradability, mechanical 
properties, cosmetic appearance and interface properties. The particular application of the 
compositions will define the appropriate formulation. Potential matrices for the compositions 



WO 01/53312 PCT/US00/34263 

may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate, 
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials 
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further 
matrices are comprised of pure proteins or extracellular matrix components. Other potential 
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass, 
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above 
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and 
tricalcium phosphate. The bioceramics may be altered in composition, such as in 
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and 
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and 
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns. 
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl 
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from 
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses 
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose, 
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and 
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose 
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate, 
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol). 
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on 
total formulation weight, which represents the amount necessary to prevent desorption of the 
protein from the polymer matrix and to provide appropriate handling of the composition, yet not 
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the 
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further 
compositions, proteins or other active ingredients of the invention may be combined with other 
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in 
question. These agents include various growth factors such as epidermal growth factor (EGF), 
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and 
insulin-like growth factor (IGF). 

The therapeutic compositions are also presently valuable for veterinary applications. 
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired 
patients for such treatment with proteins or other active ingredients of the present invention. The 
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue 
regeneration will be determined by the attending physician considering various factors which 
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of 
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g., 
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and 
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and 
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of 
other known growth factors, such as IGF I (insulin like growth factor I), to the final composition 
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone 
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline 
labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other known 
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in 
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 

4.12.3 EFFECTIVE DOSAGE 

Pharmaceutical compositions suitable for use in the present invention include 
compositions wherein the active ingredients are contained in an effective amount to achieve their 
intended purpose. More specifically, a therapeutically effective amount means an amount 
effective to prevent development of or to alleviate the existing symptoms of the subject being 
treated. Determination of the effective amount is well within the capability of those skilled in 
the art, especially in light of the detailed disclosure provided herein. For any compound used in 
the method of the invention, the therapeutically effective dose can be estimated initially from 
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a 
circulating concentration range that includes the IC50 as determined in cell culture (i.e., the 
concentration of 
the test compound which achieves a half-maximal inhibition of the protein's biological activity). 
Such information can be used to more accurately determine useful doses in humans. 

A therapeutically effective dose refers to that amount of the compound that results in 
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic 
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell 

35 cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the 
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population) and the ED 50 (the dose therapeutically effective in 50% of the population). The dose 
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the 
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred. 
The data obtained from these cell culture assays and animal studies can be used in formulating a 
5 range of dosage for use in human. The dosage of such compounds lies preferably within a range 
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may 
vary within this range depending upon the dosage form employed and the route of administration 
utilized. The exact formulation, route of administration and dosage can be chosen by the 
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in M The 

10 Pharmacological Basis of Therapeutics", Ch. 1 p.l. Dosage amount and interval may be adjusted 
individually to provide plasma levels of the active moiety which are sufficient to maintain the 
desired effects, or minimal effective concentration (MEC). The MEC will vary for each 
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will 
depend on individual characteristics and route of administration. However, HPLC assays or 

15 bioassays can be used to determine plasma concentrations. 
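
The therapeutic index defined above, the ratio between LD50 and ED50, can be sketched in Python as follows. The function name `therapeutic_index` and the example values are hypothetical and purely illustrative; no measured LD50 or ED50 values are given in the text.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index, LD50 / ED50.

    Both doses must be positive and expressed in the same units
    (for example mg/kg). Hypothetical helper for illustration only.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Illustrative values only: two candidate compounds, doses in mg/kg.
candidates = {
    "compound A": (200.0, 10.0),  # (LD50, ED50)
    "compound B": (50.0, 10.0),
}
ranked = sorted(
    candidates,
    key=lambda name: therapeutic_index(*candidates[name]),
    reverse=True,
)
for name in ranked:
    print(name, therapeutic_index(*candidates[name]))
```

Sorting in descending order places the compound with the higher therapeutic index first, consistent with the stated preference for compounds exhibiting high therapeutic indices.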

Dosage intervals can also be determined using the MEC value. Compounds should be 
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the 
time, preferably between 30-90% and most preferably between 50-90%. In cases of local 
administration or selective uptake, the effective local concentration of the drug may not be 
related to plasma concentration. 
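
As a rough illustration of the MEC criterion above, the following Python sketch estimates the fraction of time that plasma levels exceed the MEC from a series of plasma concentration measurements, and reports which of the stated ranges (10-90%, 30-90%, 50-90%) the regimen falls into. The function names, the band labels, the assumption of evenly spaced samples, and the sample values are ours and are not part of the disclosure.

```python
def fraction_above_mec(concentrations, mec):
    """Fraction of samples whose plasma concentration exceeds the MEC.

    Assumes the samples are evenly spaced over the dosing interval, so the
    fraction of samples approximates the fraction of time above the MEC.
    """
    if not concentrations:
        raise ValueError("at least one concentration is required")
    above = sum(1 for c in concentrations if c > mec)
    return above / len(concentrations)


def coverage_band(fraction):
    """Map a time-above-MEC fraction onto the ranges stated in the text."""
    if 0.5 <= fraction <= 0.9:
        return "most preferred (50-90%)"
    if 0.3 <= fraction <= 0.9:
        return "preferred (30-90%)"
    if 0.1 <= fraction <= 0.9:
        return "within 10-90%"
    return "outside 10-90%"


# Illustrative plasma levels (arbitrary units) sampled at equal intervals.
samples = [0.5, 1.2, 2.0, 1.5, 0.8, 0.4, 0.3, 0.2, 1.1, 1.6]
fraction = fraction_above_mec(samples, mec=1.0)
print(f"{fraction:.0%} of the time above MEC: {coverage_band(fraction)}")
```

A finer sampling grid, or interpolation between measurements such as those obtained by HPLC assay or bioassay, would give a better estimate; the counting approach here is only the simplest option.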

An exemplary dosage regimen for polypeptides or other compositions of the invention
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter
intervals.
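
For illustration, the per-kilogram ranges above can be converted into a total daily amount for a given body weight. The sketch below only performs the unit conversion and multiplication. The function and constant names and the 70 kg body weight are hypothetical, and the result is not a dosing recommendation.

```python
UG_PER_MG = 1000.0

# (low, high) in mg per kg of body weight per day, from the ranges stated above.
EXEMPLARY_MG_PER_KG = (0.01 / UG_PER_MG, 100.0)
PREFERRED_MG_PER_KG = (0.1 / UG_PER_MG, 25.0)


def daily_dose_range_mg(weight_kg: float, mg_per_kg_range: tuple) -> tuple:
    """Scale a (low, high) mg/kg/day range to total mg/day for one body weight."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = mg_per_kg_range
    return (low * weight_kg, high * weight_kg)


# Hypothetical 70 kg adult.
print(daily_dose_range_mg(70.0, EXEMPLARY_MG_PER_KG))
print(daily_dose_range_mg(70.0, PREFERRED_MG_PER_KG))
```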

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician. 

4.12.4 PACKAGING

The compositions may, if desired, be presented in a pack or dispenser device which may 
contain one or more unit dosage forms containing the active ingredient. The pack may, for 
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may 
be accompanied by instructions for administration. Compositions comprising a compound of the 
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an
appropriate container, and labeled for treatment of an indicated condition. 

4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of the
invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and F(ab')2
fragments, and an Fab expression library. In general, an antibody molecule obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well,
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all such classes,
subclasses and types of human antibody species.

An isolated related protein of the invention may be intended to serve as an antigen, or a 
portion or fragment thereof, and additionally can be used as an immunogen to generate 
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal 
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the 
invention provides antigenic peptide fragments of the antigen for use as immunogens. An
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence
of the full length protein, such as an amino acid sequence shown in SEQ ID NO: 1787, and
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific
immune complex with the full length protein or with any fragment that contains the epitope.
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred 
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its 
surface; commonly these are hydrophilic regions. 

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of the related protein that is located on the surface of the protein, e.g., a
hydrophilic region. A hydrophobicity analysis of the human related protein sequence will
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely
to encode surface residues useful for targeting antibody production. As a means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity
may be generated by any method well known in the art, including, for example, the Kyte
Doolittle or the Hopp Woods methods, either with or without Fourier transformation. See, e.g.,
Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J.
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
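
As one illustrative sketch of the hydropathy analysis referred to above, a sliding-window average of Kyte-Doolittle hydropathy values can be computed over a protein sequence, with strongly negative windows indicating hydrophilic stretches. The window length of 9, the threshold of 0.0, the function names, and the test peptide are assumptions chosen for illustration; no Fourier transformation is applied in this sketch.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 9) -> list:
    """Mean hydropathy of every window of the given length along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[res] for res in seq]  # KeyError on non-standard residues
    return [sum(scores[i:i + window]) / window for i in range(len(seq) - window + 1)]


def hydrophilic_windows(sequence: str, window: int = 9, threshold: float = 0.0) -> list:
    """Start indices (0-based) of windows whose mean hydropathy is below the threshold."""
    profile = hydropathy_profile(sequence, window)
    return [i for i, value in enumerate(profile) if value < threshold]


# Hypothetical peptide: a lysine-rich (hydrophilic) stretch followed by isoleucines.
print(hydrophilic_windows("KKKKKIIIII", window=5))
```

Windows flagged in this way are candidates for surface-exposed regions; whether a given region is actually antigenic would still have to be determined experimentally.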

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components. 

Various procedures known within the art may be used for the production of polyclonal or
monoclonal antibodies directed against a protein of the invention, or against derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below.

5.13.1 Polyclonal Antibodies

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, 
goat, mouse or other mammal) may be immunized by one or more injections with the native 
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to
a second protein known to be immunogenic in the mammal being immunized. Examples of such
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin,
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an
adjuvant. Various adjuvants used to increase the immunological response include, but are not
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions,
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known techniques, 
such as affinity chromatography using protein A or protein G, which provide primarily the IgG
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to
purify the immune specific antibody by immunoaffinity chromatography. Purification of
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The 
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28). 

5.13.2 Monoclonal Antibodies

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used 
herein, refers to a population of antibody molecules that contain only one molecular species of 
antibody molecule consisting of a unique light chain gene product and a unique heavy chain
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal
antibody are identical in all the molecules of the population. MAbs thus contain an antigen 
binding site capable of immunoreacting with a particular epitope of the antigen characterized by 
a unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro.
The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies:
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin.
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in
a suitable culture medium that preferably contains one or more substances that inhibit the growth
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which substances prevent the growth of HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable high level
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and
mouse-human heteromyeloma cell lines also have been described for the production of human
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp.
51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for 
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding 
specificity of monoclonal antibodies produced by the hybridoma cells is determined by
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in the
art. The binding affinity of the monoclonal antibody can, for example, be determined by the 
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably, 
antibodies having a high degree of specificity and a high binding affinity for the target antigen 
are isolated. 
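
For illustration, a classical linear Scatchard transform estimates binding affinity by regressing the bound/free ratio on the bound concentration; the slope equals -1/Kd and the x-intercept equals Bmax. The sketch below is a plain least-squares fit and is not necessarily the procedure of the cited reference. The function name and the synthetic data, which are generated from an assumed single-site binding model, are hypothetical.

```python
def scatchard_fit(bound: list, free: list) -> tuple:
    """Fit bound/free = (Bmax - bound) / Kd by least squares; return (Kd, Bmax)."""
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two matched bound/free measurements")
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("a Scatchard plot for single-site binding must slope downward")
    kd = -1.0 / slope          # dissociation constant
    bmax = intercept * kd      # total binding-site concentration (x-intercept)
    return kd, bmax


# Synthetic single-site data generated with Kd = 2.0 and Bmax = 10.0 (arbitrary units).
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [10.0 * f / (2.0 + f) for f in free_conc]
print(scatchard_fit(bound_conc, free_conc))
```

A smaller Kd corresponds to a higher binding affinity. With real measurements, curvature in the plot would suggest that the single-site assumption does not hold.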

After the desired hybridoma cells are identified, the clones can be subcloned by limiting
dilution procedures and grown by standard methods. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.
The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or
affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as 
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the
invention can be readily isolated and sequenced using conventional procedures (e.g., by using
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for
example, by substituting the coding sequence for human heavy and light chain constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368,
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the 
invention to create a chimeric bivalent antibody. 



5.13.2 Humanized Antibodies

The antibodies directed against the protein antigens of the invention can further comprise
humanized antibodies or human antibodies. These antibodies are suitable for administration to
humans without engendering an immune response by the human against the administered
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins,
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other antigen-
binding subsequences of antibodies) that are principally comprised of the sequence of a human
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin.
Humanization can be performed following the method of Winter and co-workers (Jones et al.,
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding
non-human residues. Humanized antibodies can also comprise residues which are found neither
in the recipient antibody nor in the imported CDR or framework sequences. In general, the
humanized antibody will comprise substantially all of at least one, and typically two, variable
domains, in which all or substantially all of the CDR regions correspond to those of a non-human
immunoglobulin and all or substantially all of the framework regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at
least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)).

5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire
sequences of both the light chain and the heavy chain, including the CDRs, arise from human
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein.
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in humans
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al.
(Nature 368 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals 
which are modified so as to produce fully human antibodies rather than the animal's endogenous 
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins
are inserted into the host's genome. The human genes are incorporated, for example, using yeast
artificial chromosomes containing the requisite human DNA segments. An animal which
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the
animal after immunization with an immunogen of interest, as, for example, a preparation of a
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the 
immunoglobulins with human variable regions can be recovered and expressed to obtain the 
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for 
example, single chain Fv molecules. 

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker, 
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells 
contain the gene encoding the selectable marker. 

A method for producing an antibody of interest, such as a human antibody, is disclosed in 
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing
an expression vector containing a nucleotide sequence encoding a light chain into another
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an
antibody containing the heavy chain and the light chain.

In a further improvement on this procedure, a method for identifying a clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication 
WO 99/53049. 

5.13.4 Fab Fragments and Single Chain Antibodies

According to the invention, techniques can be adapted for the production of single-chain 
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778). 
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent and (iv) Fv fragments.

5.13.5 Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that 
have binding specificities for at least two different antigens. In the present case, one of the
binding specificities is for an antigenic protein of the invention. The second binding target is any
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
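
The figure of ten different antibody molecules can be reproduced by a simple enumeration. The sketch below assumes that either heavy chain may pair with either light chain to form a half-molecule, and that an antibody is an unordered pair of half-molecules. The labels H1, H2, L1 and L2 are hypothetical names for the two heavy and two light chains.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ("H1", "H2")
light_chains = ("L1", "L2")

# Each half-molecule is one heavy chain paired with one light chain (4 possibilities).
half_molecules = list(product(heavy_chains, light_chains))

# An antibody is an unordered pair of half-molecules, repeats allowed.
antibodies = list(combinations_with_replacement(half_molecules, 2))

# The desired bispecific molecule pairs each heavy chain with its cognate light chain.
desired = {("H1", "L1"), ("H2", "L2")}
correct = [ab for ab in antibodies if set(ab) == desired]

print(len(antibodies), len(correct))
```

Under these assumptions only one of the ten combinations has the correct bispecific structure, which is why purification of that molecule from the mixture is required.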

Antibody variable domains with the desired binding specificities (antibody-antigen 
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion 
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of 
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which are
recovered from recombinant cell culture. The preferred interface comprises at least a part of the
CH3 region of an antibody constant domain. In this method, one or more small amino acid side
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g.
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side
chain(s) are created on the interface of the second antibody molecule by replacing large amino
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such as homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically 
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe 
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment 
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells 
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity 
of human cytotoxic lymphocytes against human breast tumor targets. 

Various techniques for making and isolating bispecific antibody fragments directly from 
recombinant cell culture have also been described. For example, bispecific antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can
also be utilized for the production of antibody homodimers. The "diabody" technology
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for 
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been 
reported. See, Gruber et al., J. Immunol. 152:5368 (1994). 

Antibodies with more than two valencies are contemplated. For example, trispecific 
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which 
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an 
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on 
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide
chelator, such as EOTUBE, DPTA, DOTA, or TETA. Another bispecific antibody of interest
binds the protein antigen described herein and further binds tissue factor (TF).

5.13.6 Heteroconjugate Antibodies

Heteroconjugate antibodies are also within the scope of the present invention. 
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089).
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic
protein chemistry, including those involving crosslinking agents. For example, immunotoxins
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and methyl-4-
mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.

5.13.7 Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector function, so as
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond
formation in this region. The homodimeric antibody thus generated can have improved
internalization capability and/or increased complement-mediated cell killing and antibody-
dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp Med., 176: 1191-1195 (1992)
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced anti-
tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff
et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that 
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities. 
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989). 



5.13.8 Immunoconjugates

The invention also pertains to immunoconjugates comprising an antibody conjugated to a
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a
radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of
radionuclides are available for the production of radioconjugated antibodies. Examples include
212Bi, 131I, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional 
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP), 
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl), 
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido 
compounds (such as bis(p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as 
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as tolylene 2,6-diisocyanate), 
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a 
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987). 
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid 
(MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See 
WO94/11026. 

In another embodiment, the antibody can be conjugated to a "receptor" (such as 
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is 
administered to the patient, followed by removal of unbound conjugate from the circulation 
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn 
conjugated to a cytotoxic agent. 

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can 
be recorded on computer readable media. As used herein, "computer readable media" refers to 
any medium which can be read and accessed directly by a computer. Such media include, but 
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and 
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM 
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled 
artisan can readily appreciate how any of the presently known computer readable media can 
be used to create a manufacture comprising computer readable medium having recorded thereon 
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for 
storing information on computer readable medium. A skilled artisan can readily adopt any of the 
presently known methods for recording information on computer readable medium to generate 
manufactures comprising the nucleotide sequence information of the present invention. 

A variety of data storage structures are available to a skilled artisan for creating a 
computer readable medium having recorded thereon a nucleotide sequence of the present 
invention. The choice of the data storage structure will generally be based on the means chosen 
to access the stored information. In addition, a variety of data processor programs and formats 
can be used to store the nucleotide sequence information of the present invention on computer 
readable medium. The sequence information can be represented in a word processing text file, 
formatted in commercially-available software such as WordPerfect and Microsoft Word, or 
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, 
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring 
formats (e.g., text file or database) in order to obtain computer readable medium having recorded 
thereon the nucleotide sequence information of the present invention. 
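
By way of illustration only, the two storage formats mentioned above (a plain ASCII text file and a database application) can be sketched as follows. The sketch is written in Python and uses the SQLite engine from the Python standard library in place of the commercial databases named above; the FASTA-style file layout, the table name, the function names and the example sequence are all assumptions made for the example and are not taken from the sequence listing.

```python
import sqlite3


def write_fasta(path, records, width=60):
    """Write {identifier: sequence} pairs to an ASCII text file in FASTA layout."""
    with open(path, "w", encoding="ascii") as handle:
        for seq_id, seq in records.items():
            handle.write(f">{seq_id}\n")
            for i in range(0, len(seq), width):
                handle.write(seq[i:i + width] + "\n")


def read_fasta(path):
    """Read a FASTA-layout text file back into a {identifier: sequence} dict."""
    records, current = {}, None
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                current = line[1:]
                records[current] = ""
            elif current is not None:
                records[current] += line
    return records


def store_in_database(conn, records):
    """Store the same records in a relational table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(seq_id TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO sequences (seq_id, residues) VALUES (?, ?)",
        list(records.items()),
    )
    conn.commit()


def fetch_sequence(conn, seq_id):
    """Return the stored residues for seq_id, or None if it is absent."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None
```

The text file favors portability and inspection by eye, whereas the database form favors keyed retrieval; as stated above, the choice will generally follow the means chosen to access the stored information.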

By providing any of the nucleotide sequences SEQ ID NO:1-1786 and 3573-5358 or a 
representative fragment thereof, or a nucleotide sequence at least 95% identical to any of the 
nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358 in computer readable form, a skilled 
artisan can routinely access the sequence information for a variety of purposes. Computer 
software is publicly available which allows a skilled artisan to access sequence information 
provided in a computer readable medium. The examples which follow demonstrate how 
software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and 
BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system 
is used to identify open reading frames (ORFs) within a nucleic acid sequence. Such ORFs may 
be protein encoding fragments and may be useful in producing commercially important proteins 
such as enzymes used in fermentation reactions and in the production of commercially useful 
metabolites. 
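
For orientation, the notion of identifying ORFs within a nucleic acid sequence can be sketched in a few lines of Python. The sketch is not the BLAST or BLAZE software referred to above; it is a simple scan, under the assumption that an ORF runs from an ATG codon to the first in-frame stop codon, over the three reading frames of each strand. The function names and the minimum-length parameter are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Scan all six reading frames for ATG...stop open reading frames.

    Returns (strand, frame, start, end, orf_sequence) tuples, where start
    and end are 0-based offsets on the scanned strand and min_codons counts
    the start and stop codons.
    """
    seq = seq.upper()
    results = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        results.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None
    return results
```

A candidate ORF found in this way would ordinarily still be compared against known sequences with a homology search before being treated as protein encoding.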

As used herein, "a computer-based system" refers to the hardware means, software 
means, and data storage means used to analyze the nucleotide sequence information of the 
present invention. The minimum hardware means of the computer-based systems of the present 
invention comprises a central processing unit (CPU), input means, output means, and data 
storage means. A skilled artisan can readily appreciate that any one of the currently available 
computer-based systems is suitable for use in the present invention. As stated above, the 
computer-based systems of the present invention comprise a data storage means having stored 
therein a nucleotide sequence of the present invention and the necessary hardware means and 
software means for supporting and implementing a search means. As used herein, "data storage 
means" refers to memory which can store nucleotide sequence information of the present 
invention, or a memory access means which can access manufactures having recorded thereon 
the nucleotide sequence information of the present invention. 

As used herein, "search means" refers to one or more programs which are implemented 
on the computer-based system to compare a target sequence or target structural motif with the 
sequence information stored within the data storage means. Search means are used to identify 
fragments or regions of a known sequence which match a particular target sequence or target 
motif. A variety of known algorithms are disclosed publicly, and a variety of commercially 
available software for conducting search means is available and can be used in the computer-based 
systems of the present invention. Examples of such software include, but are not limited to, 
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A 
skilled artisan can readily recognize that any one of the available algorithms or implementing 
software packages for conducting homology searches can be adapted for use in the present 
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino 
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can 
readily recognize that the longer a target sequence is, the less likely a target sequence will be 
present as a random occurrence in the database. The most preferred sequence length of a target 
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide 
residues. However, it is well recognized that searches for commercially important fragments, 
such as sequence fragments involved in gene expression and protein processing, may be of 
shorter length. 
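
The relationship between target length and chance occurrence can be made concrete with a rough estimate. The sketch below assumes, as a simplification that is not part of this specification, that the residues of the database are independent and equally frequent, so that a target of n residues is expected to occur by chance about (N - n + 1) / a^n times in a database of N residues drawn from an alphabet of a symbols (a = 4 for nucleotides, a = 20 for amino acids). The function name is illustrative.

```python
def expected_random_matches(target_len, db_len, alphabet_size=4):
    """Expected number of exact chance matches of one target in a random database.

    Assumes independent, uniformly distributed residues (a simplification).
    """
    if target_len <= 0 or alphabet_size <= 1:
        raise ValueError("target_len must be positive and alphabet_size > 1")
    if target_len > db_len:
        return 0.0
    positions = db_len - target_len + 1
    return positions * float(alphabet_size) ** (-target_len)


if __name__ == "__main__":
    # Compare a six-nucleotide target with a thirty-nucleotide target
    # against a hypothetical database of one billion residues.
    for n in (6, 30):
        print(n, expected_random_matches(n, 10**9))
```

Under this model a six-nucleotide target is expected many times over in any large database, while a thirty-nucleotide target is expected far less than once, which is consistent with the preference stated above for longer targets.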

As used herein, "a target structural motif," or "target motif," refers to any rationally 
selected sequence or combination of sequences in which the sequence(s) are chosen based on a 
three-dimensional configuration which is formed upon the folding of the target motif. There are 
a variety of target motifs known in the art. Protein target motifs include, but are not limited to, 
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited 
to, promoter sequences, hairpin structures and inducible expression elements (protein binding 
sequences). 
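
As one illustration of searching for a nucleic acid target motif, the sketch below looks for candidate hairpin structures, taken here to mean a stem segment followed, after a short loop, by its reverse complement. The stem length, the loop-length limits and the function names are assumptions chosen for the example; the sketch tests exact base pairing only and makes no thermodynamic prediction.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hairpins(seq, stem=4, min_loop=3, max_loop=8):
    """Find candidate hairpins: a stem, a loop, then the stem's reverse complement.

    Returns (start, loop_length, left_arm, loop, right_arm) tuples with
    0-based start offsets.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        wanted = reverse_complement(left)
        for loop_len in range(min_loop, max_loop + 1):
            j = i + stem + loop_len
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == wanted:
                hits.append((i, loop_len, left, seq[i + stem:j], seq[j:j + stem]))
    return hits
```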


4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to 
control gene expression through triple helix formation or antisense DNA or RNA, both of which 
methods are based on the binding of a polynucleotide sequence to DNA or RNA. 
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are 
designed to be complementary to a region of the gene involved in transcription (triple helix - see 
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan 
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca 
Raton, FL (1988)). Triple helix formation optimally results in a shut-off of RNA transcription 
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into 
polypeptide. Both techniques have been demonstrated to be effective in model systems. 
Information contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide. 
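
The complementarity requirement for an antisense oligonucleotide can be sketched as follows. The sketch takes an mRNA sequence written 5' to 3', selects a region of 20 to 40 bases, and returns the complementary DNA oligonucleotide, also written 5' to 3'. The function name, the choice of a DNA oligonucleotide and the example sequence used in testing are assumptions; selection of an effective target region in practice depends on factors not modeled here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return the DNA antisense oligonucleotide (5'->3') for an mRNA region.

    mrna   : mRNA sequence written 5'->3' (U or T accepted)
    start  : 0-based offset of the target region
    length : oligonucleotide length, restricted to the 20-40 base range
    """
    if not 20 <= length <= 40:
        raise ValueError("length should be between 20 and 40 bases")
    region = mrna.upper().replace("U", "T")[start:start + length]
    if len(region) < length:
        raise ValueError("target region runs past the end of the sequence")
    # The antisense strand is the reverse complement of the target region.
    return region.translate(COMPLEMENT)[::-1]
```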



4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of 
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic 
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated 
with a suitable label. 

In general, methods for detecting a polynucleotide of the invention can comprise 
contacting a sample with a compound that binds to and forms a complex with the polynucleotide 
for a period sufficient to form the complex, and detecting the complex, so that if a complex is 
detected, a polynucleotide of the invention is detected in the sample. Such methods can also 
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers 
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed 
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is 
detected in the sample. 

In general, methods for detecting a polypeptide of the invention can comprise contacting 
a sample with a compound that binds to and forms a complex with the polypeptide for a period 
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a 
polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the 
antibodies or one or more of the nucleic acid probes of the present invention and assaying for 
binding of the nucleic acid probes or antibodies to components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a test sample vary. 
Incubation conditions depend on the format employed in the assay, the detection methods 
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One 
skilled in the art will recognize that any one of the commonly available hybridization, 
amplification or immunological assay formats can readily be adapted to employ the nucleic acid 
probes or antibodies of the present invention. Examples of such assays can be found in Chard, 
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers, 
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry, 
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice 
and Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology, 
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the 
present invention include cells, protein or membrane extracts of cells, or biological fluids such as 
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method 
will vary based on the assay format, nature of the detection method and the tissues, cells or 
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane 
extracts of cells are well known in the art and can readily be adapted in order to obtain a 
sample which is compatible with the system utilized. 

In another embodiment of the present invention, kits are provided which contain the 
necessary reagents to carry out the assays of the present invention. Specifically, the invention 
provides a compartment kit to receive, in close confinement, one or more containers which 
comprise: (a) a first container comprising one of the probes or antibodies of the present 
invention; and (b) one or more other containers comprising one or more of the following: wash 
reagents and reagents capable of detecting presence of a bound probe or antibody. 

In detail, a compartment kit includes any kit in which reagents are contained in separate 
containers. Such containers include small glass containers, plastic containers or strips of plastic 
or paper. Such containers allow one to efficiently transfer reagents from one compartment to 
another compartment such that the samples and reagents are not cross-contaminated, and the 
agents or solutions of each container can be added in a quantitative fashion from one 
compartment to another. Such containers will include a container which will accept the test 
sample, a container which contains the antibodies used in the assay, containers which contain 
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which 
contain the reagents used to detect the bound antibody or probe. Types of detection reagents 
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the 
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of 
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed 
probes and antibodies of the present invention can be readily incorporated into one of the 
established kit formats which are well known in the art. 

4.17 MEDICAL IMAGING 

The novel polypeptides and binding partners of the invention are useful in medical 
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the 
invention is involved in the immune response, for imaging sites of inflammation or infection). 
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of 
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a 
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target 
site. 

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention 
further provides methods of obtaining and identifying agents which bind to a polypeptide 
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO:1-1786 
and 3573-5358, or bind to a specific domain of the polypeptide encoded by the nucleic 
acid. In detail, said method comprises the steps of: 

(a) contacting an agent with an isolated protein encoded by an ORF of the present 
invention, or nucleic acid of the invention; and 

(b) determining whether the agent binds to said protein or said nucleic acid. 

In general, therefore, such methods for identifying compounds that bind to a 
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of 
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting 
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds 
to a polynucleotide of the invention is identified. 

Likewise, in general, such methods for identifying compounds that bind to a 
polypeptide of the invention can comprise contacting a compound with a polypeptide of the 
invention for a time sufficient to form a polypeptide/compound complex, and detecting the 
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a 
polypeptide of the invention is identified. 

Methods for identifying compounds that bind to a polypeptide of the invention can also 
comprise contacting a compound with a polypeptide of the invention in a cell for a time 
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a 
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene 
sequence expression, so that if a polypeptide/compound complex is detected, a compound that 
binds a polypeptide of the invention is identified. 

Compounds identified via such methods can include compounds which modulate the 
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to 
activity observed in the absence of the compound). Alternatively, compounds identified via such 
methods can include compounds which modulate the expression of a polynucleotide of the 
invention (that is, increase or decrease expression relative to expression levels observed in the 
absence of the compound). Compounds, such as compounds identified via the methods of the 
invention, can be tested using standard assays well known to those of skill in the art for their 
ability to modulate activity/expression. 

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected 
and screened at random or rationally selected or designed using protein modeling techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and 
the like are selected at random and are assayed for their ability to bind to the protein encoded by 
the ORF of the present invention. Alternatively, agents may be rationally selected or designed. 
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen 
based on the configuration of the particular protein. For example, one skilled in the art can 
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the 
like, capable of binding to a specific peptide sequence, in order to generate rationally designed 
antipeptide peptides, for example see Hurby et al., "Application of Synthetic Peptides: Antisense 
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and 
Kaspczak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like. 

In addition to the foregoing, one class of agents of the present invention, as broadly 
described, can be used to control gene expression through binding to one of the ORFs or EMFs 
of the present invention. As described above, such agents can be randomly screened or 
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design 
sequence specific or element specific agents, modulating the expression of either a single ORF or 
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding 
agents are agents which contain base residues which hybridize or form a triple helix 
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester, 
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have 
base attachment capacity. 

Agents suitable for use in these methods preferably contain 20 to 40 bases and are 
designed to be complementary to a region of the gene involved in transcription (triple helix - see 
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et 
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560 
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca 
Raton, FL (1988)). Triple helix formation optimally results in a shut-off of RNA transcription 
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into 
polypeptide. Both techniques have been demonstrated to be effective in model systems. 
Information contained in the sequences of the present invention is necessary for the design of an 
antisense or triple helix oligonucleotide and other DNA binding agents. 

Agents which bind to a protein encoded by one of the ORFs of the present invention can 
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the 
present invention can be formulated using known techniques to generate a pharmaceutical 
composition. 



4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid 
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The 
hybridization probes of the subject invention may be derived from any of the nucleotide 
sequences SEQ ID NO:1-1786 and 3573-5358. Because the corresponding gene is only 
expressed in a limited number of tissues, a hybridization probe derived from any of the 
nucleotide sequences SEQ ID NO:1-1786 and 3573-5358 can be used as an indicator of the 
presence of RNA of a cell type of such a tissue in a sample. 
Any suitable hybridization technique can be employed, such as, for example, in situ 
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides 
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in 
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The 
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a 
degenerate pool of possible sequences for identification of closely related genomic sequences. 
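
A degenerate pool of possible sequences can be enumerated mechanically from a probe written with the standard IUPAC nucleotide ambiguity codes. The following sketch, in which the function name is an illustrative assumption, expands such a probe into the discrete sequences it represents.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """List every discrete sequence represented by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combo) for combo in product(*choices)]
```

The size of the pool is the product of the number of alternatives at each position, so a probe with several fully degenerate (N) positions grows by a factor of four for each such position.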

Other means for producing specific hybridization probes for nucleic acids include the 
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors 
are known in the art and are commercially available and may be used to synthesize RNA probes 
in vitro by means of the addition of the appropriate RNA polymerase such as T7 or SP6 RNA 
polymerase and the appropriate radioactively labeled nucleotides. The nucleotide sequences may 
be used to construct hybridization probes for mapping their respective genomic sequences. The 
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a 
chromosome using well known genetic and/or chromosomal mapping techniques. These 
techniques include in situ hybridization, linkage analysis against known chromosomal markers, 
hybridization screening with libraries or flow-sorted chromosomal preparations specific to 
known chromosomes, and the like. The technique of fluorescent in situ hybridization of 
chromosome spreads has been described, among other places, in Verma et al. (1988) Human 
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY. 

Fluorescent in situ hybridization of chromosomal preparations and other physical 
chromosome mapping techniques may be correlated with additional genetic map data. Examples 
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation 
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or 
predisposition to a specific disease) may help delimit the region of DNA associated with that 
genetic disease. The nucleotide sequences of the subject invention may be used to detect 
differences in gene sequences between normal, carrier or affected individuals. 

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES 

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for 
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced 
using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those of 
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to 
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be 
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72); 
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell 
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all 
references being specifically incorporated herein. 

Another strategy that may be employed is the use of the strong biotin-streptavidin 
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6, 
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on 
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo. 
Of course, this same linking chemistry is applicable to coating any surface with streptavidin. 
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies 
(Alameda, CA). 

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be used. Nunc 
Laboratories have developed a method by which DNA can be covalently bound to the microwell 
surface termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino 
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be 
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the 
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA 
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42). 

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has 
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed 
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using 
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the 
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently 
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to 
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end 
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and 
then streptavidin used to bind the probes. 

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and 
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole, 
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is 
then dispensed into CovaLink NH strips (75 ul/well) standing on ice. 

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in 
10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are incubated for 5 hours at 
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are 
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed 
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C). 
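
The volumes and concentrations in the linkage method above follow from ordinary dilution arithmetic, which may be sketched as below. The sketch assumes ideal mixing with additive volumes; the 900 ul batch volume in the usage example is hypothetical, and the function names are illustrative.

```python
def stock_volume_to_add(sample_ul, stock_mM, final_mM):
    """Volume of stock (ul) to add to a sample to reach a final concentration.

    Solves stock_mM * x = final_mM * (sample_ul + x) for x.
    """
    if final_mM >= stock_mM:
        raise ValueError("final concentration must be below the stock concentration")
    return sample_ul * final_mM / (stock_mM - final_mM)


def mix_concentration(conc_a, vol_a, conc_b, vol_b):
    """Concentration of a solute after mixing two solutions (same units in and out)."""
    return (conc_a * vol_a + conc_b * vol_b) / (vol_a + vol_b)


if __name__ == "__main__":
    # 0.1 M (100 mM) 1-MeIm7 stock added to a hypothetical 900 ul DNA batch
    # to reach 10 mM final.
    print(stock_volume_to_add(900.0, 100.0, 10.0))   # ul of stock to add
    # 25 ul of 0.2 M (200 mM) carbodiimide added to 75 ul of DNA solution per well.
    print(mix_concentration(0.0, 75.0, 200.0, 25.0))  # mM carbodiimide in the well
    # Both solutions are 10 mM in 1-MeIm7, so that concentration is unchanged.
    print(mix_concentration(10.0, 75.0, 10.0, 25.0))
```

On these assumptions, one part of 0.1 M stock is added to nine parts of DNA solution, and each 100 ul well contains carbodiimide at one quarter of its stock concentration.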

It is contemplated that a further suitable method for use with the present invention is that 
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by 
reference. This method of preparing an oligonucleotide bound to a support involves attaching a 
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic 
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported 
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard 
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include 
nucleoside phosphoramidite and nucleoside hydrogen phosphonate. 

An on-chip strategy for the preparation of DNA probe 
arrays may be employed. For example, addressable laser-activated photodeprotection may be 
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by 
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also 
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res. 
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem. 
169(1) 104-8; all references being specifically incorporated herein. 

To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991), 
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of 
oligonucleotides with cyanuric chloride. 

One particular way to prepare support bound oligonucleotides is to utilize the 
light-generated synthesis described by Pease et al. (1994) PNAS USA 91(11) 5022-6, incorporated 
herein by reference. These authors used current photolithographic techniques to generate arrays of 
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct 
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile 
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile 
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be 
generated in this manner. 
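
The combinatorial character of such a synthesis can be illustrated with a small simulation. The sketch below assumes, purely for illustration, that four positions of the probe are varied (4^4 = 256 cells) and that each coupling cycle uses a mask exposing exactly those cells which are to receive one particular base at one particular position; it models the bookkeeping of masks and cycles only, not the photochemistry or the direction of chain growth. All names are hypothetical.

```python
from itertools import product

BASES = "ACGT"


def combinatorial_array(n_positions):
    """Simulate mask-directed synthesis of every sequence of n variable positions.

    Each array cell is addressed by a tuple of base indices. For every
    position there are four coupling cycles, one per base, and the "mask"
    for a cycle exposes only the cells whose address calls for that base.
    Returns ({cell_address: probe_sequence}, number_of_cycles).
    """
    cells = list(product(range(len(BASES)), repeat=n_positions))
    probes = {cell: "" for cell in cells}
    cycles = 0
    for position in range(n_positions):
        for base_index, base in enumerate(BASES):
            for cell in cells:
                if cell[position] == base_index:  # cell is exposed by this mask
                    probes[cell] += base
            cycles += 1
    return probes, cycles
```

In this model 16 coupling cycles suffice to produce 256 distinct probes, which illustrates why the number of probes grows exponentially while the number of synthesis steps grows only linearly with probe length.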

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic 
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA, 
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes 
three protocols for the isolation of high molecular weight DNA from mammalian cells (p. 
9.14-9.23). 

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or 
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples 
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be 
prepared in 2-500 ml of final volume. 

The nucleic acids would then be fragmented by any of the methods known to those of skill 
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et 
al. (1989), shearing by ultrasound and NaOH treatment. 

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic 
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are 
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever 
device allows controlled application of low to intermediate pressures to the cell. The results of these 
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA 
fragmentation methods. 

One particularly suitable way for fragmenting DNA is contemplated to be that using the two 
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res. 
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation 
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and 
sequencing. 

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
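
The site specificities described above can be illustrated with a minimal Python sketch. It only scans a sequence for the stated recognition patterns (PuGCPy normally; PyGCPy and PuGCPu in addition under the relaxed CviJI** conditions) and reports where the blunt cut between G and C would fall. The function names and the test sequence are hypothetical illustrations and are not taken from Fitzgerald et al.; the sketch ignores digestion kinetics and partial digestion.

```python
PURINES = "AG"
PYRIMIDINES = "CT"


def cviji_cut_sites(seq, relaxed=False):
    """Return 0-based indices of the C in each cleaved N-G-C-N site.

    The cut falls between G and C, so a returned index i means the
    sequence is split into seq[:i] and seq[i:].
    relaxed=False models the normal PuGCPy specificity;
    relaxed=True additionally accepts PyGCPy and PuGCPu (CviJI**).
    """
    seq = seq.upper()
    sites = []
    for i in range(1, len(seq) - 2):
        if seq[i:i + 2] != "GC":
            continue
        left, right = seq[i - 1], seq[i + 2]
        normal = left in PURINES and right in PYRIMIDINES
        extra = (left in PYRIMIDINES and right in PYRIMIDINES) or (
            left in PURINES and right in PURINES
        )
        if normal or (relaxed and extra):
            sites.append(i + 1)
    return sites


def fragment_lengths(seq, sites):
    """Lengths of the fragments produced by cutting at every listed site."""
    bounds = [0] + list(sites) + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    demo = "AGCTTGCTAGCA"  # made-up test sequence
    print(cviji_cut_sites(demo))                 # normal specificity
    print(cviji_cut_sites(demo, relaxed=True))   # relaxed (CviJI**) specificity
```

Because the relaxed pattern matches three of the four possible NGCN classes, cut sites become much denser, which is consistent with the quasi-random fragment distribution described in the text.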

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 ug instead of 2-5
ug); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is
important to denature the DNA to give single stranded pieces available for hybridization. This is
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8x12 cm membrane.


WO 01/53312 PCT/US00/34263

Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
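
The geometry of this example can be checked with a short Python sketch. The text gives 64 patients, a 96-pin device, a 1 mm dot span and a 1 mm space between subarrays. The sketch additionally assumes, as illustration only, that the pins form an 8 x 12 grid at a 9 mm pitch (the usual 96-well plate spacing) and that the 64 dots of each subarray are laid out as an 8 x 8 block by offset printing. All names are hypothetical.

```python
PIN_ROWS, PIN_COLS = 8, 12   # assumed 96-pin head layout
PIN_PITCH_MM = 9.0           # assumed pin spacing (standard 96-well pitch)
GRID = 8                     # assumed 8 x 8 dots per subarray (64 patients)
DOT_PITCH_MM = 1.0           # dot span stated in the text


def dot_position(patient, pin_row, pin_col):
    """Return (x, y) in mm of one patient's dot printed by one pin.

    Each patient plate is printed with the whole pin head shifted by a
    patient-specific offset, so every pin builds its own 8 x 8 subarray.
    """
    if not 0 <= patient < GRID * GRID:
        raise ValueError("patient index must be in 0..63")
    off_row, off_col = divmod(patient, GRID)
    x = pin_col * PIN_PITCH_MM + off_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + off_row * DOT_PITCH_MM
    return x, y


def layout_summary():
    """Total dot count, gap between subarrays, and printed footprint in mm."""
    subarray_span = GRID * DOT_PITCH_MM
    return {
        "dots": GRID * GRID * PIN_ROWS * PIN_COLS,
        "gap_mm": PIN_PITCH_MM - subarray_span,
        "width_mm": (PIN_COLS - 1) * PIN_PITCH_MM + subarray_span,
        "height_mm": (PIN_ROWS - 1) * PIN_PITCH_MM + subarray_span,
    }


if __name__ == "__main__":
    print(layout_summary())
```

Under these assumptions an 8 mm subarray plus the 1 mm space reproduces the 9 mm pin pitch, and the printed footprint of 107 mm x 71 mm fits on the 8x12 cm membrane mentioned in the text.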

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois)
which may be partitioned by physical spacers e.g. a plastic grid molded over the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage
screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of the
present disclosure, one of skill in the art will appreciate that many other embodiments and variations
may be made in the scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following examples. The
present invention is not to be limited in scope by the exemplified embodiments which are intended
as illustrations of single aspects of the invention, and compositions and methods which are
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and
variations in the practice of the invention are expected to occur to those skilled in the art upon
consideration of the present preferred embodiments. Consequently, the only limitations which
should be placed upon the scope of the invention are those which appear in the appended claims.

All references cited within the body of the instant specification are hereby incorporated by 
reference in their entirety. 

5.0 EXAMPLES

5.1.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various
human tissues and in some cases isolated from a genomic library derived from human chromosome
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The
inserts of the library were amplified with PCR using primers specific for the vector sequences which
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered
into groups of similar or identical sequences. Representative clones were selected for sequencing.
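
The clustering step can be pictured with the following minimal Python sketch. The text does not state how signatures were compared, so the sketch makes several assumptions for illustration: each clone's signature is modelled as the set of 7-mer probes that gave a positive hybridization signal, similarity is measured with the Jaccard index, and clustering is a simple greedy pass with an arbitrary threshold of 0.8. None of these choices is taken from the specification.

```python
def jaccard(a, b):
    """Jaccard similarity of two probe sets (1.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_by_signature(signatures, threshold=0.8):
    """Greedily group clones whose probe signatures are similar.

    signatures: dict mapping clone id -> set of positive 7-mer probes.
    Returns a list of (representative_id, [member ids]) tuples; the first
    clone placed in a cluster acts as its representative.
    """
    clusters = []
    for clone_id, signature in signatures.items():
        for rep_id, members in clusters:
            if jaccard(signature, signatures[rep_id]) >= threshold:
                members.append(clone_id)
                break
        else:
            # No existing cluster was similar enough: start a new one.
            clusters.append((clone_id, [clone_id]))
    return clusters


if __name__ == "__main__":
    # Made-up signatures for three hypothetical clones.
    demo = {
        "clone1": {"AAAAAAA", "CCCCCCC", "GGGGGGG", "TTTTTTT", "ACGTACG"},
        "clone2": {"AAAAAAA", "CCCCCCC", "GGGGGGG", "TTTTTTT", "ACGTACG"},
        "clone3": {"AAAAAAA", "GATTACA", "TGCATGC"},
    }
    for representative, members in cluster_by_signature(demo):
        print(representative, members)
```

In such a scheme the representative of each cluster would be the natural candidate to send for sequencing, which mirrors the selection of representative clones described above.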

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.

5.1.2 EXAMPLE 2

Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 3573-5358,
were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend
the seed EST into an extended assemblage, by pulling additional sequences from different databases
(i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri 114, and UniGene
version 101) that belong to this assemblage. The algorithm terminated when there were no
additional sequences from the above databases that would extend the assemblage. Inclusion of
component sequences into the assemblage was based on a BLASTN hit to the extending assemblage
with BLAST score greater than 300 and percent identity greater than 95%.
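
The extension loop described above can be sketched in Python as follows. This is a simplified illustration, not the algorithm actually used: it assumes the BLASTN results are already available as (query, subject, score, percent identity) tuples, treats a hit to any current member as a hit to the extending assemblage, and tracks only sequence identifiers rather than building a consensus. The function and variable names are hypothetical; only the two thresholds come from the text.

```python
MIN_SCORE = 300        # BLAST score must be greater than this
MIN_IDENTITY = 95.0    # percent identity must be greater than this


def extend_assemblage(seed_id, hits):
    """Grow an assemblage from a seed EST until no hit extends it.

    hits: iterable of (query_id, subject_id, score, percent_identity)
    tuples standing in for BLASTN results against the databases.
    Returns the set of sequence ids in the final assemblage.
    """
    hits = list(hits)
    members = {seed_id}
    changed = True
    while changed:                      # stop when a full pass adds nothing
        changed = False
        for query, subject, score, identity in hits:
            if (
                query in members
                and subject not in members
                and score > MIN_SCORE
                and identity > MIN_IDENTITY
            ):
                members.add(subject)
                changed = True
    return members


if __name__ == "__main__":
    # Made-up hit table for illustration.
    demo_hits = [
        ("EST1", "EST2", 450, 98.0),
        ("EST2", "EST3", 320, 96.5),
        ("EST3", "EST4", 290, 99.0),   # score too low
        ("EST1", "EST5", 500, 94.0),   # identity too low
        ("EST4", "EST6", 600, 99.0),   # EST4 never joins, so EST6 is unreachable
    ]
    print(sorted(extend_assemblage("EST1", demo_hits)))
```

The loop terminates because each pass either adds at least one new member from a finite hit table or ends the iteration, which corresponds to the stated stopping condition that no further sequences extend the assemblage.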

A polypeptide was predicted to be encoded by each of SEQ ID NO: 3573-5358 as set forth
below. The polypeptides were predicted using a software program called FASTY (available from
http://fasta.bioch.virginia.edu) which selects a polypeptide based on a comparison of translated
novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98
(1990), herein incorporated by reference). The predicted polypeptides are shown in Table 7.

5.2.2 EXAMPLE 3 
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank. Other computer programs which may
have been used in the editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice
variants resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1-327.

Table 1 shows the various tissue sources of SEQ ID NO: 1-327. 

The nearest neighbor results for SEQ ID NO: 1-327 were obtained by a FASTA version 3
search against Genpept release 117, using FASTXY algorithm. FASTXY is an improved
version of FASTA alignment which allows in-codon frame shifts. The nearest neighbor result
showed the closest homologue for SEQ ID NO: 1-327 from Genpept. The translated amino acid
sequences for which the nucleic acid sequence encodes are shown in the Sequence Listing. The
nearest neighbor results for SEQ ID NO: 1-327 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.
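
As a rough illustration of the two reported values, the following Python sketch derives a maximum S score and a mean S score from a list of per-residue S scores. It assumes, based on our reading of the cited publication, that the mean S score is the average of the S scores from the N-terminus up to the predicted cleavage position; it does not run or reproduce SignalP itself, and the function name and example numbers are hypothetical.

```python
def signal_peptide_scores(s_scores, signal_length):
    """Return (max_s, mean_s) for one polypeptide.

    s_scores: per-residue signal peptide (S) scores, N-terminus first.
    signal_length: number of residues in the predicted signal peptide,
        i.e. the cleavage site lies after residue number signal_length.
    """
    if not 0 < signal_length <= len(s_scores):
        raise ValueError("signal_length must lie within the sequence")
    max_s = max(s_scores)
    # Assumed definition: average S score over the predicted signal peptide.
    mean_s = sum(s_scores[:signal_length]) / signal_length
    return max_s, mean_s


if __name__ == "__main__":
    # Made-up scores: high over the first four residues, low afterwards.
    demo_scores = [0.90, 0.95, 0.85, 0.80, 0.20, 0.10]
    print(signal_peptide_scores(demo_scores, 4))
```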

5.3.2 EXAMPLE 4 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 328-1413.

Table 1 shows the various tissue sources of SEQ ID NO: 328-1413. 

The nearest neighbor results for SEQ ID NO: 328-1413 were obtained by a BLASTP
version 2.0a19MP-WashU search against Genpept release 118, using BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 328-1413 from Genpept.
The translated amino acid sequences for which the nucleic acid sequence encodes are shown in
the Sequence Listing. The nearest neighbor results for SEQ ID NO: 328-1413 are shown in
Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.



53.2 EXAMPLE 5

Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1414-1652.

Table 1 shows the various tissue sources of SEQ ID NO: 1414-1652.

The nearest neighbor results for SEQ ID NO: 1414-1652 were obtained by a BLASTP
version 2.0a19MP-WashU search against Genpept release 118, using BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 1414-1652 from
Genpept. The translated amino acid sequences for which the nucleic acid sequence encodes are
shown in the Sequence Listing. The nearest neighbor results for SEQ ID NO: 1414-1652 are
shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.43. EXAMPLE 6 
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1653-1745.

Table 1 shows the various tissue sources of SEQ ID NO: 1653-1745.

The homologies for SEQ ID NO: 1653-1745 were obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 118, using BLAST algorithm. The results showed
homologues for SEQ ID NO: 1653-1745 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1653-1745 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.5-2 EXAMPLE 7
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 119, gb pri 119,
UniGene version 119, Genpept release 119). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants
resulting from these procedures, are shown in the Sequence Listing as SEQ ID NOS: 1746-1768.

Table 1 shows the various tissue sources of SEQ ID NO: 1746-1768.

The homologies for SEQ ID NO: 1746-1768 were obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 119, using BLAST algorithm. The results showed
homologues for SEQ ID NO: 1746-1768 from Genpept. The translated amino acid sequences for
which the nucleic acid sequence encodes are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1746-1768 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were examined
to determine whether they had identifiable signature regions. Table 3 shows the signature region
found in the indicated polypeptide sequences, the description of the signature, the eMatrix
p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the PFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the PFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by
Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites"
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum
S score and a mean S score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 5 shows the position of the signal peptide in each of the polypeptides
and the maximum score and mean score associated with that signal peptide.

5-6.2 EXAMPLE 8
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The translated amino acid sequences for which the nucleic acid
sequence encodes are shown in the Sequence Listing. The full-length nucleotide sequences,
including splice variants resulting from these procedures, are shown in the Sequence Listing as
SEQ ID NOS: 1769-1786.

Table 1 shows the various tissue sources of SEQ ID NO: 1769-1786.

The homologies for SEQ ID NO: 1769-1786 were obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 120 and the amino acid version of Geneseq
released on October 26, 2000, using BLAST algorithm. The results showed homologues for
SEQ ID NO: 1769-1786 from Genpept. The homologues with identifiable functions for SEQ ID
NO: 1769-1786 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

Table 6 is a correlation table of all of the sequences and the SEQ ID NOS.

TABLE 1

Tissue Origin 



adult brain 



RNA Source 



GIBCO 



Hyseq 
Library Name 



&B30O1 



SEQ ID NOS 



9 19-21 50-51 65-66 72 78 80 82 
85 87 107-108 113 116 123 138 



140 150-152 159 
202-203 212-214 
251 258 268-269 
298 301 321 326 
357 362 369 
443 459-460 
500 503 519 
608-609 613 
652 657-658 
695 697 710 
796 804 811 
900 912 919 
962 979 988-989 996 
1039 1047 



169 177 192-193 
225-226 235-236 
272 280-281 295 
331-332 334 



379 
473 
526 
618 
660 
715 



1008 


1018 


1067 


1070 


1116- 


1117 


1149 


1151 


1234 


1241 


1279 


1288 


1312 


1320 


1361 


1368 


1400 


1417 


1494 


1501 


1517 


1522 


1549 


1565 


1623 


1625 


1649 


1653 


1734 


1741 


1771 





382-383 
475 477 
547 574 
633-634 
669-671 
724 731 
857-859 862 
922 924-929 
1001 
1059 

1078 1082 1107 
1131 1134-1137 
1157 1180 1206 
1243 1258 
1290 1294 
1323 1330 



356- 
423 
496 
587 



1373-1375 
1446 1468 
1503 
1524 
1578 
1627 1639 
1664 1667 



1379 
1482 
1506-1507
1530-1533 
1598 1606 



416 
488 
582 

645-646 
678 687 
775-777 
869 899- 
933 936 
1004- 
1064 
1113 
1140 
1229 
1272-1273 
1307-1308 
1356 1360- 
1391 
1493- 
1512 
1537 
1608 
1643 1648- 
1671 1696 



1743-1744 1760-1761 



adult brain 



GIBCO 



ABD003 



3 12-14 18-19 25 30-31 34-36 43- 
45 50-51 56 58 60 65-66 68-69 80 
82 85 87 92 104 107-108 
115-116 123-124 131-132 
139 142 146 148-145 152 



159 163 165 167 169 
193 196-197 199 203 
214 223 233 235-237 
261 268-269 272 276 
288 291-292 295 297 
307 317 320-321 323 



112-113 
135-137 
154 157 
172 180 192- 
208 210 212- 
247 257 259 
280-281 284- 
300-301 304 
327 329-331 
379-381 
426-426 
449 



291-292 295 
317 320-321 
333-334 345-349 356-357 
393 401 408 414 419 424 
430 433-436 438-439 443 445 
453-454 459-461 468 471-473 
478 483 491 494 496 500 503 
508 516 519-520 525-527 534 
540 542-543 545 553 555 560 
570 574-576 586-588 593 595 
601 606-609 616-620 622-623 
628-633 635-636 643 645-649 
655-656 660-665 
687 701 710 715 
743 745-746 750 
773 775-778 786 
802-803 810-811 
832 834-836 840 

874 
908 
929 



861 864 869 
902 904-905 
922 924-927 
941-942 945 
977 979-980 



955-958 
985-986 



668-670 676 
717 724-728 
753 759 
789 796 
815 817 
845-847 
878 883 
911-914 
932-534 
963 
990 



476- 

507- 

536- 

569- 

597 

625 

653 

681 

735 



765-766 
799-800 
820-821 
851 858- 
897 901- 
916 921- 
936-939 
966-969 
992-993 



997-1001 1005-1007 1012 1017-
1020 1023-1024 1029-1031 1034 
1036 1039 1050 1059 1063-1066 
1078 1081-1082 1085-1086 1089






















1097 


1103 


1107 


1109 


1112 


1116- 










1117 


1119 


1121 


1124 


1127 


1130 










1134 


1144- 


1145 


1149 


1151 


1157- 










11S8 


1167 


1170 


1178 


1184 


1188 










1190 


1193- 


1194 


1200 


1202 


1215- 










1217 


1220 


1226- 


1227 


1229 


1231 










1241 


1243 


1247 


1252 


1258 


1263 










1267 


1269 


1279 


1281 


1284 


1286- 










1289 


1293- 


1294 


1306- 


•1307 


1312 










1316 


-1320 


1326 


1333 


1338 


1341 










1344 


1348 


1351 


1355- 


•1357 


1368 










1374 


1377 


1380 


1386 


1389 


-1390 










1394 


1400 


1409 


1414 


1422 


-1423 










1425 


-1427 


1437 


1443 


1446 


1454 










1456 


1458- 


1459 


1468 


1470 


-1472 










1478 


1482- 


1483 


1487- 


•1488 


1493 










1497 


1499 


1506 


1508- 


■1511 


1517 










1522 


-1524 


1530- 


1533 


1545 


-1546 










1548 


-1550 


1552 


1557- 


•1559 


• 1563 










1565 


1567 


1569 


1571 


1586 


1588 










1591 


1593 


1595 


1598- 


•1601 


1608 










1611 


1620- 


1621 


1624- 


•1626 


1628 










1630 


-1632 


1636 


1640- 


•1641 


1644- 










1645 


1647 


1649 


1653- 


•1655 


1657 










1664 


1667 


1669 


1673 


1678 


-1681 










1686 


1690 


1694- 


1696 


1701 


1709 










1711 


1719 


1722- 


1723 


1726 


-1727 










1731 


-1733 


1738 


1740 


1743 


-1744 










1747 


1749 


1753 


1757- 


•17S8 


1760- 










1761 


1765 


1771 


1785 






adult 


brain 


Clontech 


ABR001 


9 29 


68-69 113 


115 146 1 


52 206 










223 


245 277 307 320 


324 


330-331 










344 


34 8 352 362 


379 


384 


393 404 










408 


414 44 


1-442 


454 


469 


481 490 










506 


517 586 597 631 


641 


659 691 










715 


799 803 833 


865 


871 


875 880 










882 


908 920 937 


1000 1005-1006 










1027 


1036 


1041 


1043 


1075 


1107 










1112 


1121 


1127 


1136- 


1137 


1144- 










1147 


1231 


1238- 


1239 


1280 


1293 










1320 


1345 


1355 


1361 


1383 


-1384 










1400 


1417 


1448 


1456 


1476 


1507 










1570 


1572 


1609- 


1610 


1614 


1620 








■ 


1626 


1645 


1653 


1754 


1759 


1770 










1786 












adult brain



Clontech



ABR006



5-8 15-16 168 212-213 271 278
280-281 291-292 300-301 310 314
321 326 336-338 341 352 357 359-
360 362 369 374 379 384 393 396-
397 414 419-420 426-428 430 441-
442 453 506 616-617 661 589 785
798 845 1018 1109 1113 1124 1148
1167 1187 1207 1227 1252 1265
1285 1312 1317-1319 1324-1327
1344 1369 1381 1400 1416 1421
1427 1430-1431 1436 1471 1501
1557-1559 1586 1588 1651 1653
1664-1665 1671 1673 1690 1697-
1698 1700 1711 1717 1719-1720
1728 1736 1740 1743-1744 1757
1760-1761










adult brain



Clontech



ABR008



5-10 13-19 22-23 25 29 3! 3 37-39
43-45 50-51 54-55 57-58 60-66
68-70 72 75 77-80 83 85 1 39-92 94
99-105 108-110 112-113 116-117
123 128 133 135-137 139 143 145-
146 148 152 154-155 157 166 168-
172 174-175 181-184 188-190 193-
194 196 198-200 202 204-205 207-
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS:



208 210 214-215 218
231-232 234-241 245 
255 257-259 268-269 
285-286 288 
307 309-311 
325-326 
344-347 
369-373 
390-391



322 
341 
362 
387 



290-292 
313 315 
328 330 
349 352 
376 379 
393-394 
405-411 414-415 417- 
437-438 440-444 453
467 469-471 476 478 
491 497 503 506-513 
524-526 528-530 532 
542 544 547-551 553 
572-574 577 581 585 
591 597 599 601-602 
615-617 619-620 622 
631 633-634 636-641 
651-653 655-664 669 
689 691-700 
720-721 725 
746 750-752 
766 768 773 
787-789 794 
811 814-815 
839-840 842 
865 867-872 
887 889-892 
904 908 910 
921-924 926-927 
943 945 949 953 
967 969 971 975 
996-990 992 997 
1004-1006 1008 1012 
1027 1029-1031 1035 
1048 1053 1057 1059 
1070 1072-1075 1077 
1085-1093 1095-1096 
1114-1125 1127 1131 
1138 1142-1145 1148 
1163 1167 1169 1172 
1180 1183-1188 1191 
1200 1204 1206 1211 
1222-1223 1226-1227 
1234-1235 1241-1242 
1266 1269-1271 1276 
1281 1284-1286 1292 
1299 1305-1309 1312 
1319 1322 1324-1327 



662 687 
715-717 
742-743 
762-764 
784-785
803 805 
834-837 
861-862 
883-884 
898 901 
919 
941 
963 
986 



1334-1335 
1354-1355 
1369-1370 
1381-1384 
1396-1397 
1414 1419-1420 
1435 1437-1438 
1448 1453-1455 
1464 1466 1468 
1482-1483 
1509 1513 
1536 1547 
1574 1578 
1601-1602 



1339 1344 
1357-1358 
1373-1374 
1386-1388 
1400 1403 



1496 



1423 
1440 
1457 
1471 
1502 



1519-1520 
1549-1552 
1586-1589 
1605 1607 



1617 1619-1621 1623 
1635-1641 1643-1645 
1653 1656-1658 1664 
1674 1676-1684 1686 
1694-1696 1704-1705 



221-226 229 

-247 251-253 
271 276-281 
300-302 304 
317-318 320- 

-331 333-338 
354 356-357 

-380 382 384 
397 399-403 

-420 426-428 

-455 462 464 
482-484 488- 
516-517 520 

-534 537-540 
561 565-567 
587-588 590-
606-610 612 

-623 628-629 
643 645-647 

-671 673 679 
702 706 710 

-734 736-739 
756 758-759 

-778 780-782 
796 799 802- 
818 825-826 

-843 856-859 
874-875 881 
894-895 897
912 914 917 
930-932 935 

-954 958 961 
977 981-983 
999-1002 
1018-1023 

-1037 1047- 
1063 1068 
1081-1083 
1108-1112 

-1133 1135- 

-1158 1160- 
1175 1177 

-1195 1199- 
1213-1216 
1229-1231 
1244-1263 

-1277 1279- 
1294-1295
1314 1316- 
1330 1332 

-1346 1351 
1365-1367 
1376-1379 
1392 1394

-1407 1410 
1432-1433 

-1442 1446 
1461 1463- 
1477 1480 

-1504 1507- 
1524-1526 
1567 1573- 
1597-1598 

-1609 1611- 
1625-1626 
1649 1651 
1669 1671- 
1689-1690
1708-1709 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



1720-1724 1726-1728 1730-1733 
1737-1740 1742-1745 1753 1756- 
1757 1759-1761 1765 1767 1771- 
1772 1776-1777 1779-1780 1786 



adult brain 



Clontech 



ABR011 



24 75 103 186 210 310-311 364- 
365 508 623 710 937 1002-1003 
1059 1204 1609 1731-1732



adult brain 



BioChain 



ABR012 



46 182-184 204-205 300 739 767 
1371 1549 1620 1684 



adult brain 



Invitrogen 



ABR013 



185 204-205 
687 692-694 
1413 1640 



364-365 
830 845 



393 497 595 
1068 1320 



adult brain 



Invitrogen 



adult brain 



Invitrogen 



adult brain 



Invitrogen 



adult brain 



Invitrogen 



ABR014 



ABR015 



ABR016 



ABT004 



cultured 
preadipocytes 



Stratagene



ADP001 



187 301 357 364-365 375 454 463 
731 859 939 983 1073 1262 1270 
1320 1403 1640 1651 1657 1696 
1722 1738 



419 434-435 
1320 



441-442 763 789 983 



312 364-365 379 
1674 1722 1785 



1320 1334-1335 



14-16 22-23 
70-72 78 86 
137 143 146 
196 198 



194 
295 
338 
371 
399 
482
557 

655 
696 
750 
814 



25 37-39 43 58 60 
94 107 113 116 136- 
152 161 173 182-184 
210 218 229 259 267 



298 309-310 320-321 324 
346-347 349-350 356-357 
382-383 391 393 
428 438 459 461 
507-509 516 526 
602 607-609 624 
671-672 687-689 
715 721 732 739 



379-380 
401 408 



490 
562 
667 
710 
753 
826 
894-895 
951 963 
1005-1006 
1037 1052 



502 
597 
669 
712 
766 
830 
925 



778 
837 
937 
968-969 

1016-1019 
1086 1090



336- 
362 
396 
476 
531 
652 
695 
743 
803 
874 
960 



780-781 789 
841 857 869 
949 954-956 
988-989 1000 

1021 1036- 
1109 1113 



1115 
1137 
1170 
1225 
1280 
1341 

1378-1379 
1423 1429 
1452 
1525 
1554 
1585 
1608 
1627 
1666 
1723 
1779 



1120-1121 1123-1124 1136- 
1140 1144-1147 1151 1167 
1174 1188 1193-1194 1205 
1229 1231 1254 1258 1262
1285 1309 1312 1334-1335 
1343-1344 1356-1357 1370 
1383-1384 1403-1404 
1434 1442 1448 1451- 
1454 1470-1472 1482 1499 
1528-1529 1532 1536 1547
1561-1562 1567
1595 1601-1604 
1615 1619 1624 
1647 1660 1664 
1696 1704 1715 
1760-1761 1768 



1557-1559 
1588 1590 
1610-1613 
1640 1644 
1670 1675 
1727 1738 
1785-1786 



5-8 11 17 25 68-69 80 82 87 103 
105 110 116 136-138 168 171 188 
189 196-198 261 267 276 288 293 
301 318 331 336-338 379-380 391 
400 428 430-431 510-512 520 524
527 549 557 561 602 618 620 622
631 637 647 670 681-682 710 731 
748 782 793-794 817 834-836 843 
845 858-859 879 882 893-895 934 
960 982 986 995-996 1000 1002 
1005-1007 1025 1027-1028 1032 
1039 1045 1071 1078 1097 1099- 
1102 1136-1137 1140 1219-1220 
Tissue Origin



RNA Source



Hyseq 
Library Name 



SEQ ID NOS: 



1260 
1322 
1370 
1437 
1602 
1660 
1711 
1760 



1271 
1329 
1371 
1466 
1608 
1662 
1719 
1761 



1297- 

1339 

1398 

1468 

1614 

1673 

1720 

1765 



1298 
1345 
1408 
1533 
1631 
1687 
1742 
1767 



1314 
1365 
1423 
1539
1649
1688
1746 
1771 



1320 
1366 
1431 
1594 
1650 
1696 
1749 
1785 



adrenal gland 



Clontech



ADR002 



adult heart 



GIBCO 



AHR001 



4-10 15-16 25 29-31 43-45 47 50- 
51 55 60 62-63 65-66 75 80 102
116 118 122 126 130 137 150 169- 



181 192 
247 251
285 295 
351-352 
400 410 
434-437 
483 491 



170 
228 
281 
349 
391 
431 
477 
519 
581 

628-630 
713 715 
773-778 
869 875 
930-931 
976-977 
1004 
1076 
1134-1135 
1181 1188 



201-203 215 227- 
267-269 271 280-
311 336-338 342 
372-373 383-385 



527 535 
588 595 
637 
719 
789 
883 
942 
981 
1049 1055 
1112-1113 



198 
255 
298 
354 

415-416 424 
439 445 454 
493 497-498 
546 549 552 
600 602 
645-646 
732 734 
816 829 
898 904 
948 952 



1227 
1280 
1325 
1348 
1387 
1426 



1231 
1285 
1327 



1365-1366 
1398 1400 
1436 



1463-1464 

1538 1546 

1598 1609 

1627 1634 

1671 1674 

1703 1717 
1765 



426-427 
461 473 
503 516 
572-573 
608-610 620 
670 679 703 
744-746 758 
837 845 848 
912 922-923 
965 967 969 
990 992-993 1001 
1059 1071-1072 
1115 1121 1127 
1158 1163 1175 
1218 1224-1225 
1270-1271 1274 
1293 1307 1324- 
1342-1343 1345 
1369 1378-1379 
1405 1417 1425- 
1440-1441 1444 1454 
1488 1491 1507 1512 
1567 1573-1575 1588 
1614 1618 1622 1624 
1636 1649 1651 1658 
1678-1679 1691-1692 
1727 1731-1732 1737 



1151 
1209 
1243 
1290 
1330 



4-8 10-11 15-16 18-21 34-39 44- 
46 50-52 57-58 60 62-63 71 75 82 
85 87 89 94 97 100 103-104 108- 
110 112 114 116 118-119 122-123 
127 130-132 134 136-138 141-144 
147-151 153 163-164 168-171 179 
186 192 195 197 199 204-205 
215 220 225-226 229-230 232 
236 251 257-260 262 265 272 
277 280-282 285-286

307 309 
336-338 
361 368 



298-301 304 
325 330 333 
352 354 358 
384 387-388
408-409 
433-439 
457 459 
483-484 
503 506 
526 534 
560-562 
587 589 



289-292 
314 321 



393



391 
411-412 
445-446 
462 469 
487-490 
508 510-513 
536-540 542 



345 
370 
397 



349 
380 
401 



212- 
234- 
274 
296 
324- 
351- 
383- 
406 



414-416 
449 452 
472-473 
492-493 
516 
546 



574-577 
593 595 



581-582 
597 



430-431 
454-455 
476-480 
496-498 
519-522 
549 553 
584 586- 
604-609 611- 



612 615-620 622-623 626 632 637 
645-652 656-660 665-666 670-672 
674-675 683-684 687 692-694 697 
701 709 712 715-716 719-720 725- 
Tissue Origin



RNA Source

Hyseq
Library Name



adult kidney 



GIBCO 



AKD002 



SEQ ID NOS: 



726 728 730-732 
744 746 751 753 
771 775-780 785 
804 810 812 817 
837 843 845-847
869 871 
890-892 
906-907 
927-928 
967 
990 



863-864 
883 887 
901 903 
921-925 
9S1-963 
980-986 
1007 
1023 
1043 
1059
1072 
1089 
1109 
1124 
1145 



735 738-739 743- 
759 761 765 770- 
788-790 796 802 
821 826 828 830 
849-853 857-861 
875 877-879 881 
894-895 897-898
911-913 915 919 
933-935 945 958 
969-972 975 977-978 
992 999-1002 1005- 



1010 1016 1019-1020 1022- 
1025 1028-1037 1039-1040 
1047 1050 1054-1055 1057 
1063-1064 1067-1068 1070
1075-1076 1083 1085-1087 
1093-1094 1104 1106 1108- 
1113 1116-1117 1119 
1126 1128 1131-1134 
1151 1158 
1177 1192 
1206-1208 
1227-1229 
1243-1244



1121 
1144- 
1167 
1196 
1211 
1232- 
1247- 



1148-1149 
1169-1170 1175 
1199-1200 1202 
1216 1218 1222 
1235 1238-1241 
1248 1250 1253-1254 1256-1258 
1261 1268 1270-1271 1277 1280- 
1282 1287 1292 1298-1299 1306 
1308 1317-1321 1324-1325 1330 
1332 1334-1337 1339 1344-1345 
1354-1356 
1369 1371 
1383-1384 
1409 1417 



1349-1350 
1365-1366 
1378-1380 
1400 1403 



1437 1439 1442 1444 
1450 1453 1468 1470 
1481 1488 1490 1501-1504 
1521 1524 1528 1530-1534
1537 1539 1541-1542 1547
1555 1560 1565 1567-1571
1591 1597-1598 1601-1602
1619-1620 
1634 1636 



1359-1360 
1374-1375 
1389 1397 
1423-1426 
1446-1447 
1473 1479 



1614-1616 
1630-1632 



1519 
1536- 
1553 
1588 
1605 
1623-1628
1641 1644- 



1645 1647 1649 1652-1655 1659 
1662 1667 1673-1674 1680-1681 
1684 1686-1688 1704-1705 1709 
1711-1712 1717 1724 1726-1727 
1731-1733 1737-1738 1741 1743- 
1744 1749 1754-1755 1760-1761 
1765 1772 1785 



4-8 10-11 17-21 29-31 35-39 42- 
45 50-51 56-58 60-61 64 68-69 75 
77 80 82 85 87 92-94 97 100 102-
104 107-108 112 116-117 119 123 
127-133 136-137 139-141 143-144 
147-154 157 161-163 165-166 169 
172 176 178-179 192 194-197 199 
201 203-206 209-210 212-213 215- 
216 223-228 234-236 238 247 251- 
253 257-259 261-262 265-269 271- 
272 274 276-277 279-281 284-286
290 293 295 298-299 301-302 304
307 311-313 321 325-326 329-331 
333 341 344 348-350 352 356 358- 
359 362 364-365 368 370-372 374 
376-377 380-382 392 395 398 400- 
401 404 407-409 414-415 
430-437 443-444 446 449 
455 459 461-462 464 467



474 476-477 480-481 483 



423-424 
451 453- 
469 471- 
487-488 
RNA Source



Hyseq 
SEQ ID NOS:



490-491 493 497-505 510-513 516-520 522 524 526-529
534 537-540 544 547 549 554-556 560 562 564 567
571-576 578 582 586-589 592-593 598-599 601 604-606
608-613 615-619 621-626 632-634 637-643 645-652 655
660-664 669-672 676 678-679 688 692-695 698 702 711
713 717 719-720 727 731 735-736 738 743 745-746 751
753 755 762-763 765 771-773 775-778 780 786 788 793
795-796 800 803 805 808 810-812 814-819 821 826 829
832 834-838 842-845 848-855 857-861 864-865 867 869
871 874 876-883 886-887 889-891 893-896 898-900 902
906-908 910-914 918 920 922 925-927 929-935 937
940-942 945 948-949 951 953-958 960-961 963-964
969-970 972 976-978 982-986 988-990 992-993 995-997
999-1002



adult kidney 



Invitrogen 



AKT002 



1004-1008 1010 



1012-1013 1016- 
1022 1025-1031 
1042 1044 1047 
1057-1064 1068 
1085-1086 1088- 
1097 1099-1102 
1116-1119 1121 



1017 1019-1020 
1035 1038-1040 
1050 1054-1055 
1070-1073 1078 
1089 1092 1094 
1107 1109-1112 
1123-1125 1132-1135 1140 
1143 1146-1147 1149-1150 
1154 1157 1159 
1178-1179 1181 
1200 1202-1204 

1219 1221-1222 1225 1227-1230 
1232-1234 1238-1241 1243-1244 
1246-1247 1253 1257-1258 1260- 
1261 1267-1268 1270 1272-1274 
1283 1287-1289 1293-1295
1308 1311-1313 1317- 
1329-1330 1334-1335 
1349-1350 
1369 1373 



1163 1167 
1183 1192 
1206-1211 
1225 



1407-1409 
1428-1431 
1443 
1454 
1475 
1493
1509 



1400 
1419 



1397 
1417 
1433 
1445-1446 
1459 1461 
1478 1484-1488
1495 1497-1498 



1353-1357 
1375 1378- 
1403 1405 
1423-1424 
1437-1438 1442- 
1448-1450 1453- 
1465-1468 1474- 
1490 1492- 
1506-1507 



1512 



1518 1521-1522 1525 
1527-1528 1532-1533 1537 1540- 
1541 1547-1550 1552 1556-1559 



1561 1565-1566 
1578-1579 1583 
1591-1592 1594 
1604 1606 1608 
1618-1622 
1634-1636 
1646-1649 
1666-1667 
1683-1684 
1696-1699 1701 
1714 1716-1719 
1727 1733 1737-1738 
1744 1748-1749 1751 



1568 1571 
1586-1587 
1598 1600 
1611 1613 
1624-1628 
1638-1639 
1653-1656 
1670-1671 
1686 1691-1692 
1709-1711 1713- 
1723-1724 1726- 
1741 1743- 
1760-1761 



1763-1768 1778 1780 



20-21 37-39 47 52 57 
68-69 80 104 107-108 
136-137 140 142-143 149 169 174 



1785 

60 65-66 
122 130 133 
Tissue Origin



RNA Source 



Hyseq 
Library Name



SEQ ID NOS: 



181 197 227-228 
261-265 267 280 
301 304-305 309
344-345 349 358 
383 387 392 401 
443 445 449 453 
504 506 513 516 
540 546 554 585
607 616-617 626 
664 695 709 721 
775-777 788 796 
838 849-850 852 
890-892 898 903 
925 927 934 941 
962 968 970 100 
1044 1052 1055 
1073 1085 1099- 
1111 1113 1115 
1136-1137 1146- 
1192 1196 1199 
1256 1264 1272- 
1293-1294 1299 
1325 1330 1344 
1356 1369 1378- 
1419 1428-1429 
1463-1464 1467- 
1478 1486 1491 
1529 1534 1547 
1623 1629 1631 
1647 1652 1660 
1670 1673 1686 
1776 



235-236 244 251 

-281 286 290 299 

312-313 339 341 

370-372 376 382 

414 416 421 430 
-454 472
519 522 

587 594 598 

-627 636 643 

735 743 761 

804 814 827 
-853 



487-488
528 536 



602 
662 
768 
837 
881 
919 
960 



869-870 
905-907 914 
949 952 957 
0 1008 1029-1030 
1063 1067-1068 
1102 1107 1110- 
1119 1126 1134 
1148 1153 1159 
1232-1233 1241 
1273 1281 1285 
1312 1320 1324- 
1349 1351 1355-
1379 1403 1414 
1436 1446 1458
1468 1470 1477- 
1509 1519 1527 
1596 1600 1619 
1634 1638 1643 
1664 1667 1669- 
1709 1727 1740 



adult lung 



GIBCO 



ALG001 



4-8 14 37-39 44-46 50-51 56 62- 
63 75 82 88 93 103-104 113 125 
133 140 143 150 152 154 157 162 
171-172 174-175 190-191 196 200 
211 214 219 223-224 227-228 251 
252 256 265 272 274 280-281 285 
310 332 345 351 362 371 381-382 
394 408-409 431 436 445 454 459 
461 467 469 471 476-477 488 504 
513 527 537-540 544 547-548 554
564 583 607 616-617 
634 645-646 662-664 
719 743-744 763 766 
811 814 817 831-832
852-853 858-859 861 
901 905 941 954-957 
979 981 987 990 992 
1005-1006 1014 1017 



623-624 
695 716 



621 
670 

774 789 
837-838
866 880 
966 971 
996 1001 
1045 1047 



303 
845 
887 
977 



lymph node 



Clontech 



1054 1059 
1086-1089 
1136-1137 
1190 1200 
1273 1280 
1331-1332 
1384 1404 
1442 1474 
1525 1531-1532 
1554 1571 1598 
1627-1629 1632 
1569 1676-1677 
1731 
1786 



1062 
1094 
1142 
1208 
1282 
1353 
1409 
1478 



1064 
1107 
1150 
1220 
1295
1374 
1423 
1494 
1547 
1606
1642 
1684 



1072 
1126 
1157 
1241 
1306 
1379 
1434 
1509 
1549 
1613 
1644 
1696 



1080 

1134 

1173 

1272- 

1320 

1383- 

1436 

1522 

1553- 

1624 

1662 

1727 



1732 1737-1738 1748-1749 



ALN001



4 24 50-51 82 105 137 153 198 
201 223-224 234 268-269 272 280 
281 287 301 312 329 343 382 421 
430 433 445 451 461-462 475 481 
482 503 526 529 537-540 546-547 
Tissue Origin



RNA Source



Hyseq
Library Name



SEQ ID NOS:


























621 626 649 679 719 725-726 738
793 803 831 834-836 838 844 857-
858 866 879 905 913 928 963 976
1005-1006 1012 1038 1050 1116-
1117 1151 1199 1204 1226 1243
1265 1274 1324-1325 1339 1353
1374 1377 1440-1441 1447 1504
1549 1600 1618-1619 1631 1641
1644 1653 1687-1688 1691-1692
1741 1771














young liver



GIBCO



ALV001



5-8 11 20-21 46 50-51 58 65-66
75 79 82 93 97 102-103 108 110
116 139 143-144 148-149 171-172
174 187-189 194-195 198 209 214-
215 230 250 258 267-269 280-281
306 309 342 351 356 359 362 372
374 392 394 398 401 407-408 410
414 431 444 455 459 476 478 483
493 510-512 516 520 522 526 536
549 571 574-577 585 592 601-602
607 621-624 628-630 632-633 637
648 660 666-667 678 697-698 700
717 719 728 730 734 738 744-745
766 770 773 779 788 800 808 812
814 841 849-851 871 874 879 887
893 898-900 902-904 906-907 911
919 922 924 934 953 957 963 965
970 984 986 997 1001 1004 1007
1012 1029-1030 1033-1034 1052
1061 1066 1070 1076 1086 1089
1099-1102 1110-1112 1116-
1117 1119 1121 1125 1136-1137
1144-1145 1156-1157 1159 1196
1199-1200 1209 1211 1219-1220
1241 1244 1262 1270 1275 1279
1283 1295 1317-1320 1332 1339
1344 1359 1362-1363 1379 1383-
1384 1403 1415 1430-1431 1437
1450 1467 1475-1476 1483-1484
1494-1495 1498 1505 1512 1516
1518-1519 1526 1529 1547 1550-
1552 1557-1559 1565 1583 156 17
1597 1609 1614 1620 1631 1637
1641 1644 1654-1655 1662 1667
1669 1684 1691-1692 1702 1711
1725 1738 1741 1743-1744 1758
1760-1761 1763-1765 1769






adult liver



Invitrogen



ALV002



5-8 17 20-21 32-33 41 55 58 64
75 77 86 89 102 108 117 119 175-
176 198 200 209 231 235-236 250
272 275-276 284 306 316 321 325
333 356 359 374 376 398 401 408
414 428 430 433-435 454 476 494
503-505 517-518 528 534 544 552
561-563 567 578 581 608-609 630
632 637 644 650 661 665 672 702
707 710 721-722 750 753 778 782
794 814 820 826 834-837 847 849-
850 858 861 874 879 893 898 904
911 918 921-922 926 946 948 972
978 986 996 1020 1027 1031 1034
1053 1063 1068 1070 1073 1086
1089 1093 1097 1113 1119 1156
1159 1195 1198-1199 1208 1220
1227 1241 1261 1272-1273 1277
1285 1308 1315 1320 1324-1325
1330 1362-1363 1375 1403 1408-
1409 1415 1431-1432 1435 1467
1469 1482 1504 1524 1542 1547
Tissue Origin



adult liver 



adult ovary 



RNA Source



Hyseq 
Library Name 



Clontech



ALV003 



Invitrogen 



AOV001 



SEQ ID NOS: 



1550 
1597 
1618 
1647 
1669 
1738 
1765 



1567 
1601 
1619 
1652 
1671 
1742 
1772 



1578 
1602 
1621 
1654 
1684 
1744 
1774 



1581 
1611 
1625 
1655 
1706 
1760 



1583 
1612 
1637 
1660 
1722 
1761 



1594 
1615 
1645 
1666 
1737 
1753- 



29 676 997 1063 1119 1536 1766 



1 4-18 20-23 29 35-4 
51 53-58 61-63 65-66 
77-78 80 82 85 87 89 
103-104 106-108 110 
122-124 126 128 133- 
142 145-147 149-157 
170 174 177-173 180 
189 192-203 207 209 
221-224 229-230 234 
247 255 258 260-262 
272 274 277-281 284- 
295 299 301-302 304 
313-314 316 321 323- 
333 335-338 341 344 
356 358 360 362 370- 
379-384 387 390-392 
400 403 408-410 412 
424 426-427 430-435 
448-449 451 453-455 
471 473 476-479 481- 
494 496-497 499-501 
514 516-517 519-520
528-534 541-544 546- 
554-555 561-564 566- 
572-573 575-576 579 
588 590-591 593 595 
605 607-613 615 618- 
630 632-633 636-640 
649-652 654-655 657-
677-678 681 683-684 
710 714-721 723 725- 
734-735 743-746 750- 
763 765 767 772-773 
783-784 786 788 790-
800 803 805 809-811 
819 821-824 826 828- 
837-838 843-850 852- 
867 869 871-872 874- 
887-888 890-895 898- 
916 919-922 924 926- 
941 943-946 948-951 
961-964 966-967 970- 
985-986 988-990 992 
1001 1004-1009 1011- 
1019-1020 1024-1025 
1037 1039 
1054-1060 
1072-1073 
1085-1086 
1098-1103 
1119-1120 
1142-1143 
1158 1163 
1173-1175 
1190- 



1033-1035 
1050-1051 
1067-1070 
1078-1079 
1094-1096 
1112-1117 
1131-1135 
1153 1156 
1169-1171 
1180 1183-1185 
1197-1200 1202 
1219 1221-1226 
1241 1243-1244 
1254 1256-1258 
-1268 1270 1275 
1286-1289 1291 



1205- 
1232- 
1247 
1262 
1278 
1293- 



0 42-48 50- 
68-69 73-75 
97 100-101 
113 115 118 
134 136-140 
161 166 168- 
182-186 188- 
211-215 219 
242-243 246- 
265-269 271- 
286 288 290 
307 309-311 
326 330 332- 
349 352-353 
372 376-377 
394 397-398 
414-416 423- 
439 443-446 
462-463 468- 
484 487 489- 
503-505 509- 
522 524 526 
547 549 552 
567 569-570 
581 583 585-
597 599 601- 
622 624-627 
642 644-647 
665 667-675 
692-695 697- 
727 729 732 
751 753 758
775-778 780 
791 794-796 
813-815 818- 
829 831-832 
857 859-864 
875 878-883 
910 912-914 
927 929-939 
953 955-958 
979 981-982 
995-997 999- 
1013 1016 
1029-1031 
1041-1047 
1062-1064 
1075-1076 
1089-1090 
1106-1108
1123-1127 
1146-1149 
1165-1166 
1177-1178 
1191 1195 
1214 1217- 
1235 1238- 
1249 1252- 
1265 1267- 



1280-1283 
1294 1298- 
Tissue Origin | RNA Source | Hyseq

Library Name 



SEQ ID NOS:



Clontech 



APL001 



placenta 



Invitrogen 



APL002 



1299 1306 
1323 1327 
1338-1339 
1359 1361 
1377-137S 
1394 1400 
1427 1425-1431 
1443 1445-1450 
1463-1464 1466 
1481 1484-1485 
1494 1496-1498 
1507 1511-1517 



1308 1312 1317-1321 
1329-1330 1332-1333 
1341 1343-1351 1356
1365-1366 1371-1375 
1383-1384 1386 1389 
1404 1416-1417 1422- 
1435-1436 1439- 
1453-1454 1459 
1468 1470 1474- 
1488 1491 1493- 
1501-1504 1506- 
1519 



1521-1524 
1526-1527 1530-1531 1534-1536 
1538-1539 1541 1546 1548-1550 
1553 1555-1559 1561-1563 1566- 
1569-1570 1572 1574-1575 
1580-1581 1587-1588 1590- 
1595 1597-1598 1600-1606
1611-1621 1623-1630 1634 
1638 1641 1643 1645 1647- 
1659-1662 1664 1667 1669-
1673-1674 1676-1681 1683- 
1699 1702-1707 1710-1711 
1713-1714 1716-1719 1723-1724 
1726-1728 1731-1733 1735 1737- 
1738 1740-1741 1743-1744 1748- 
1751 1753 1755-1756 1760-1762 
1765 1767-1768 1770-1771 1776

1778-1779 1783-1784 1786

5-8 44-45 90-91 107-108 
311 351 414 476 503 545 
636 719 755 773 860 
947 955-956 962 990 
1045 1202 1320 1369 
1713-1714 1743-1744 



1567 
1578
1591 
1609 
1636 
1657 
1671 
1690 



159 178 
574 624 
890-891 924 
992 1002 
1628 1686 



14-16 26 
106 116 



43 60-61 79-80 
171 177 180 194 



198 210 
309 329 
423 430 
491 517 
738 746 
858 916 
1005-1006 
1068 1070 
1160 1277 
1345 1429
1486 1490 
1592-1593 
1664 1673 
1746 1776 



29 
135 

216 235-236 
334 339 359 
434-435 448 
522 631 723 
769 818 843 
948 953-954 
1013 1033 
1086 1139 
1285 
1435 
1512 
1602 
1675 



103 
196 
299 
417 
490- 
728 
857- 



272 290 
379-380 
454 483 
725-726 
854-855 
976 988-989 
1036 1064 
1144-1145 
1317-1320 1343 
1438 1454 1482 
1519 1532 1549 
1626 1647 1649 
1722 1727 1730 



adult spleen 



GIBCO 



ASP001 



3 5-8 12 15-16 19-21 24 
44-45 57 60 82-83 87 89 



103 106 108 
147 152-153 
178-180 196 
215 219 234
272 280-281 
325 333 341 
387 394 406 
448 451 473 
505 517 519 
554 557 



117 119-121 
155 166 169 
198 201-206 
253-254 256 
290 295 302 309 
349 358 372 382 
414 431 434-436 
481 490-493 500 
530 534 536-540 



29 34-36 
94 98-99 
139 141 
171 174 
209-211 
258 264 



611-612 
652 659 
700 721 
746 762 
810-811 
852-853 



574-576 
620-621 
661 667 



312 
386- 
446 
503 
547 
604 
642 
684 



728 
765 
817 
858 



730 
774 
822 
862 



582 592 595 
623 631-632 
671 673-675 
732 738 742-744 
780 788-789 794
830 832 845 848 
866 874 879 882 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



testis 



GIBCO 



Genomic DNA 
from BAC 63118



Research 
Genetics 
(CITB BAC 
Library) 



Genomic DNA 
from BAC 35316 



Research 
Genetics 
(CITB BAC 
Library) 



BAC002 



SEQ ID NOS: 



884 906-908 
927 934 94?. 
978 983 990 
1005-1007 1010 1012 
1042-1044 1046 1049 
1070 1076 1089-1090 
1113 1115 1124 
1174 1177 
1226-1227 
1258 1269 



912 919 921-923 y^b- 
949 957-958 963 977-
992-994 996-997 999 
1031 1036 
1059 
1094 
1140 
1196 
1236 
1274 



1190 
1229 
1271 
1330 
1353 
1386 
1436-1437 1439 
1480 1485-1487 
1525 1544-1549 
1591 1600 1631 
1651 1654-1655 1658 1662
1674 1678-1679 1684 1686 
1727 1733 1738 1740-1741 
1761 1774 1779 1781-1782 



1334-1335 
1359-1360 
1397 1413 



i-8 10 26 30-31 47 
J9 82 84-85 97 102 
139 150 152 
176-177 192 
227-228 247 
288-289 301 
349 370-372 
427 430-431 
469 473 477 
503 513 522 
564 572-573 
599-602 605 



196-197 
258 261 



637 647 
674-675 
738 744 
802 804 
843 845 
913 916 
960 963 



154 
194 

255 

307 311 
392 398 
433 437 
481-482 
526 547 
575-576 
612 
649-650 
712 719-721 
746 773 780 
809 811 814 
848 859 866
919 921 926 
971 975 977 



50-51 57 68- 
113 119 137 
156 163 169 174 



212-215 
282 285 



615-617 
656 660 



552-553 
581-582 
620 



1007 1016 1029-1030 
1038-1039 1045 
1070 1072-1073 
1099-1102 1104 
1149 1161-1162 
1222 1227 1229 
1238-1239 1243 
1289 1291-1293 
1320 1330 1332 
1373-1374 1379 
1409 1423-1424 
1443 1459 1484 
1496-1497 1501 
1527 1530-1531 
1549 1563 1565 
1577 1586 1591 
1628 1630-1632 



665 670 
728 731 
788-789
831 837 
877 905 
937 950 
990 992 
1034- 
1059-1060 
1087 1089 
1108 1113 
1175 1208- 
1231 1235 
1285 1287- 
1311 1317- 
1345 1369 
1399-1400 
1435-1437 
1490 1493 
1509-1513 
1537 1546 
1569 1571 
1602 1625 
1639 1642 



1649 1661-1662 1666-1667 1670 

1675 1684 1650 1699 1705 1712

1717 1724 1730 1737-1738 1752 

1767 1779 



686 



1412 



1411-1412 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS:



Genomic DNA 
from BAC 39316 



Research 
Genetics 
(CITB BAC 
Library) 



BAC003 



1352 



adult bladder 



Invitrogen 



BLD001



bone marrow 



Clontech 



BMD001 



5-8 17-18 22-23 33 37-39 56-57 
80 93 100 120-121 169 201 237 
251-252 272 278 311 348 353 382
413 415 424 430 443 483 502 542- 
543 562 564 607 616-617 626 635 
652 667 671 710 727 755-756 762 
773 786 789 837 840 866 893 898 
909 918 929 966 977 983 1016 
1025 1055 1073 1082 1140 1167 
1185 1189 1199 1270 1369 1481 
1536 1560 1573 1596 1614 1636- 
1637 1649-1650 1654-1655 1658 
1669 1671 1690 1719 1727 1731- 
1732 1739 1741 1760-1761 1779 



3-8 11 13 18 29-31 33 35-36 40 
43-45 47-48 50-51 57 60 65-66 75 
80 82 85 88-89 94 100 103 107
110 115 118-119 124-125 133-134 
136-137 139-141 146 150 152-153 
155 161 163 168-170 172 178-180 
187 192-193 197-198 203-205 210- 
213 215 217 219 222 224-226 233 
235-237 242-244 255 258 260 263- 
264 266 273 276 278 283 286 290 

307 312-313 321 330 
352 357-358 370-371 
387 389 394 408 410 



295 
333 
382 
412 
437 



301-302 
339 343 
384-385 
416 421 



439 



461-462 
482 485 
513 516 
540 542 
569-577 
603-604 
632-633 
660 666 
701 708 
740-742 
773 
796 



424-427 
441-442 445 
471-472 475 
488 493 498 
519 523-524 
544-545 549 
581 583-586 
608-609 
636-637 



429-431 436- 
447 454-456 
477-479 481- 
500 503-506 
526 530 535- 
555 565 567 
588 593 601 
613-619 621-622 
642 649-650 656- 



775-778 
798 802 
832-833 
858-859 
890-892 
922-924 
952-953 
981 985 



670 672 674-675 
716 718-720 731 
744-745 752 761 
780 785-786 



679 683 
735-736 
765 772- 
789-791 



830 
855 
883 
914 
941 
976 

1002 1005-1007 
1028-1031 1033 
1042 1044 
1059 1061 
1079 1106 
1124 1126 
1145 1163 
1200 1202 
1228 1240 
1270 1278 
1291 1293 



810-812 
837-838 
866-867 
896 903



823-824 826 
843-844 848- 
869 878-880 
905 908 912- 



1317-1320 
1346 1349 
1369 
1400 
1419 
1433 



927 930-931 
955-958 963 
987 990 
1013 
1035 
1047 1050 
1063 1066 
1110-1113 
1134-1135 
1172 1178 
1216-1217 
1246 1254 
1281 1295 
1299-1301 
1327 1331 
1353 1356 
1372-1374 1379-1380 
1403 1406 1408 1413 
1423 1425-1427 1430-1431 
1439 1443 1446-1449 1459 



937 939- 
969 973 
992 995 1000 
1016 1025 
1037 1039 
1053-1054 
1070-1071 
1115-1117 
1142 1144- 
1199- 
1227- 
1266 
1290- 
1314 
1343 
1367 
1394 
1417 



1197 
1224 
1261 
1287 
1308 
1339 
1361 



1463-1464 1482 1486 1493-1494
WO 01/53312



Tissue Origin



RNA Source | Hyseq

Library Name 



SEQ ID NOS:



bone marrow 



Clontech 



BMD002 



bone marrow 



Clontech 



bone marrow



Clontech 



adult colon 



Invitrogen



BMD004 



1638-1639 
1653-1655 
1684 1686 
1713-1714 



30-31 35-36 
103 108-109 
174 177 180 
222 225-226 
273-274 284 
303-304 307 
330 334-335 
370-373 384 
414-416 
440 
478 
545 



1506 1509 1513 1521-1522 1524
1526 1528 1531 1536-1537 1543 
1546 1548-1549 1552 1554-1555 
1557-1559 1571-1572 1581 1569- 
1592 1597-1600 1609 1614 1621 
1626-1626 1630-1632 1634 1636 
1641 1646-1647 1651 
1661-1662 1676-1681 
1690 1702 1707 1711 
1717 1720 1722-1723 
1727 1737-1738 1740 1758 1767 
1772 1781-1782 1785-1786
11 15-16 19 
83-84 93 99 
139 169-170 
212-213 219 
255 259 264 
292 295 301 
316 324 326 
353 357 360
397 403-404 
429-430 433-436 
465-466 472 475 
520 523 525 531 
569-570 581 583 590-591 
601 616-617 621 641 650 
659 671 674-675 679 684 
719 728 734 737-738 742 
774-778 790 811 814 818 
836 854-855 859 866 869 



879 
990 



884 
992 



889 
998 



892 904 



68-69 75 
118 137 
190 193 
232 237 
286 290- 
312-313 
348 352- 
386-387 
425-427 
451 454 
493 516 
552 566
597-598 
652 656 
710 718- 
761 765
830 834- 
871 878- 
922-923 932 



421 
444 
491 
548 



1001 1004 1016 1036 



1042 


1048 


1051 


1054- 


1055 


1058 


1088- 


1089 


1106 


1112- 


1114 


1155 


1157 


1192 


1200 


1223 


1227- 


1228 


1236- 


1237 


1260- 


1261


1282-1283 


1285 


1287 


1295 


1314 


1317-1321 


1324-1327 


1330 


1333 


1341 


1343 


1347 


1350 


1353 


1355- 


1357


1367 


1369- 


1370


1373 


1377 


1379 


1381 


1383- 


1384 


1394 


1397 


1400 


1406 


1413 


1417 


1425- 


1427


1438 


1442 


1446 


1459- 


-1460 


1470 


1493 


1505


1521 


1536 


1546- 


-1549 


1560 


1573- 


1574 


1578 


1598- 


-1600 


1621 


1626 


1631 


1634 


1646 


1649 


1653 


1656 


1658 


1669 


-1670 


1683- 


-1684 


1687- 


1688 


1690 


-1693 


1696 


1699


1702 


1704 


1707 


-1709 


1711 


1720 


1722- 


1723 


1725 


1727 


1729 


1731- 


-1733 


1738- 


-1740 


1743- 


-1746 


1752 


1755 



1760-1761 
1786 



1767 1777 1781-1782 



73-74 



BMD007 



95-96 



503 922 1036
866 1320 



1711 



CLN001






1475

17 56-58 103 110 117 144 150 171
179 185 188-189 201 204-206 210 



218-221 225-226 
288 310 312 320 
394 408 420 455 
512 590-591 615 
672 684 697 710 
786 788 826-827 
858 866 872 898 



231 237 251 277 
333 359 386 388 
481 485 503 510- 
635 647-646 665 
725-726 743 780 
848-850 854-855 
918 921-923 953 



976 983 993 1005-1006 1017 1020 
1025 1027 1054-1055 1063 1068- 
1069 1140 1153 1170 1185 1196
1199 1220 1280 1314-1315 1320 
1345 1351 1355 1369 1428 1439
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



1512 1556 1583 1587 

1614 1625-1626 1631

1650 1675-1677 1687- 

1713-1714 1724 1740 



1462-1464 
1594 1596 
1639 1645 
1688 1701 
1765 



Mixture of 16 
tissues - 
mRNAs 



Various 
Vendors 



CTL016 



401 1490 1686 



Mixture of 16 
tissues - 
mRNAs*



Various 
Vendors 



CTL021



312 782 1132-1133 1403 1712 1715 



adult cervix 



BioChain 



CVX001 



1 4-8 11 13 18-21 25-26 30-31 33 
37-39 43 46-47 58 61 64-66 71 
73-74 82 85 94 100 103-104 113 
118 122 126 130 134 140 147 153- 
163 170 179 181 186 192 
198 201-202 218-219 222 
257 266 276-277 285-286 
304 307 



156 
IS 6 
231 
298 
326 
362 
388 



301-302 
329-330 332 
371-372 376 
398 400 410 
426-427 430-431 
448 461-462 464 
483 491 493 
516-517 526 
547 557 561
582 585-586 
602 604-605 
623 644 650 



335 
379 
414 



312-314 
342 352 
381-382 
416 



195- 
229- 
288 
324 
358 
384 



496 
530 



670 672 
708-709 
731-732 
765 771 
798 800 



680 
711 
737 



419-420 
433-436 439 446 
471-477 479 482- 
503 506 510-513 
535 542-544 546- 
572-573 575-577 581- 
588-589 593-594 600 
607-609 612 615-619 
654 657-658 662-665 
683 691-694 696 706 
713 720-721 727 
745-747 753-754 



774-777 780 790 793 
803 805 818 826 
832 834-836 843 847-848 
857-860 864-866 869 871 
880 882 887 890-891 897 
905-908 912-913 916 
927 932 934-938 944 



729 
760 
796 
828 831- 
851-855 
876 878- 
899-902 
918-919 922 
948 955-956 



958 963-964 967 
978-979 983 985 
1005-1007 1016-1017 
1033 1036 1038 1045
1066-1067 1071 
1082 1098 1113 



1056 
1079 
1134 
1170 
1200 
1222 
1241 
1270 
1311 
1349 



1139 1146-1149 
1173 1175 1177 
1202 1211 1214 
1225 1227 1232-1234 
1243 1258 1264-1265 
1279 1287-1290 1308 
1316 1320 1323 1327 



972 976 
1000 
1024 1027 
1053- 
1075 
1129 
1167 
1197 
1221- 
1240- 
1268 
1310- 
1345 



969-970 
990 992 



1047 
1073 
1124 
1163 
1181 
1216 



1353-1354 



1383-1384 1386 



1360 1372-1374 
1394 1397 1405- 



* The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult brain
mRNA (Invitrogen), 2) normal adult kidney mRNA (Invitrogen), 3) normal adult liver
mRNA (Invitrogen), 4) normal fetal brain mRNA (Invitrogen), 5) normal fetal kidney
mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal skin mRNA
(Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA
(Clontech), 10) human lymphoblastic leukemia mRNA (Clontech), 11) human thymus
mRNA (Clontech), 12) human lymph node mRNA (Clontech), 13) human spinal cord
mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15) human esophagus mRNA
(BioChain), 16) human conceptional umbilical cord mRNA (BioChain).
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



diaphragm 



BioChain



DIA002 



endothelial 
cells 



Stratagene



EDT001 



1406 


1416 


1425- 


1427 


1431 


1436- 


1437 


1442 


1446 


1448 


1453 


1459 


1466 


1472 


1478 


1482 


1496 


1501- 


1503 


1506 


1512 


1522 


1527- 


1528 


1531 


1533 


1541 


1547 


1569 


1571 


1585 


1589 


1597- 


1598 


1600 


1608- 


1609 


1614- 


1616 


1620 


1623- 


1624 


1626-1628 


1630 


1638 


1641 


1643 


1649 


1653 


1656 


1662 


1667 


1669 


1674 


-1675


1683 


1685 


-1688 


1699 


1702 


1709- 


1710 


1715 


1717 


1722 


1724 


1729 


1731- 


1732 


1735- 


1739 


1741 


1743- 


1744 


1748 


-1749 


1755 


1760 


-1762 


1767 


1773 


1778 


1785- 


1786 













137 282 289 730 
1478 1599 1614 



780 986 1409 



3 5-10 13 15-21 24-26 29 34 37- 
39 42 44-45 50-51 53-55 57-58 
60-61 65-66 68-69 73-74 77-78 80 
82-83 85 87 89 93-96 101-105 108 
110 112-114 116 118-122 124 128 
133-134 137-142 147-150 
166-172 176-179 
196-201 204-207 
224 225-230 233 



152-153
187 190 
210 212- 
235-236 
261-262 265 
279-281 284- 
301-302 310- 
329 331-333 
360 371 375 
392 397 400 
416 425-427 
431 434-436 439 444-445 449 454 
463-464 472-475 477-479 486 488- 
490 497-498 500-504 510-513 516- 
515 522 524 526-528 532-534 536- 
540 542-546 548 561-563 566-567 



161-163 
1S2 194 
214 220 
240-241 
267-269 
285 288
311 313 
335 340 
380-382 
407-408 



251-252 258 
272 276-277 
295-296 
321 325 
351-355 
387 390 
412 414 



290 
316 
342 
364 
410 



572-576 
595 597
620 622 
644 647 



579 581
599 603 
626 630 
656-660 



585-586 
607-612 
632-634 
662-664 
707 
734 
768 771 
793 800 
816-818 
832 



678 680-682 692-697 
712-713 719 730 732 
743-746 751 759 
778- 783 786-789 
807 810-811 814 
824 826 828-829 832 834-838 
845 848-850 854-860 862 864 
871 874 876-879 883 885 887 
891 894-895 898-900 903 908
913 916 919-922 924 926-928 
935 939 943 948-949 951-954 
964 969-970 973 



589 593 
615-617 
638-641 
670 673 
709-710 
736 738 
773 775- 
803 805-
821-822 
842- 



869 

890- 

910- 

930- 

957 



959-961 964 969-970 973 975-978 
983-984 988-990 992-993 996-997 
1000 1002 1004-1013 1016-1020 
1022-1025 1028 1031 1033-1034 
1038-1046 1050 1055-1056 1059- 
1060 1062-1064 1067-1070 1072- 
1074 1076 1078 1082 1086-1087 
1089-1090 1093-1097 1099-1103 
1107 1109-1113 1116-1117 1124- 
1126 1128-1131 1134-1135 1138 
1140 1144-1145 1148-1149 1153 
1157 1160 1163 1171 1183-1184 
1198-1199 1202 1205-1207 1211
1216-1217 1219 1221 1225 1229 
1232-1235 1238-1241 1243-1244 
1246 1250 1253 1257-1258 1261 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



Genomic clones 
from the short 
arm of 
chromosome 8 
esophagus 



fetal brain



fetal brain 



Genomic DNA 
from 
Genetic 
Research 



EPM001 



BioChain



ESO002



Clontech 



FBR001 



Clontech 



FBR004



fetal brain



Clontech 



FBR006 



SEQ ID NOS: 



1265-1266 1268
1277 1280-1283 
1290 1293 1295 
1317-1320 1324- 
1330 1334-1335 
1345-1347 1350 
1367 1369 1374 
1400 1406 1408 
1424-1426 1428 
1440-1442 1448 
1468 1472 1474 
1491-1493 1501 
1511 1516 1520 
1531 1536-1537 
1547 1549 1552 
1561-1565 1568 
1579 1581-1583 
1592 1597 1605- 
1615 1618-1621 
1631 1634 1636 
1650 1652-1659 
1669 1671 1675- 
1696-1698 1703 
1719 1722-1723 
1736 1739-1741 
1755 1760-1761 
1771-1773 1776 
286 686 1297 13 
1411-1412 1754 



1270-1271 
1285-1286 
1298 1308 
1325 1327 
1338 1342- 
1355-1356 
1376 1379 
1414 1417 
1431 1434- 
1450 1462- 
1478 1487- 
1504 1506 
1521 1526 
1539-1540 
1555 1557- 
1571 1575 
1587-1588 
1606 1611 
1624-1628 
1638 1641 
1664 1666 
1681 1683 
1711 1715 
1726 1731 
1743-1744 
1765 1767 
1779 1783 
03-1304 i: 



131-132 261 289 
1000 1007 1397 



380 503 860 892



62-63 89 112 126 194 322 336-JJ* 
379 391 411 481 546 563 607 679 
710 867 1012 1031 1055 1251 1262 
1320 1407 1643 1652 1686 1731- 
1732 1746 1765 



68-69 90-91 139 212-213 J01 iJl 
362 374 403 436 611 645-646 659 
668 670 691 785 805 845 1163 
1209 1216 1232-1233 1238-1239 
1387 1410 1416 1430 1496 1536 
1547 1593



5-9 25 43 60 62-63 65-66 70 7<J 
QO 87 92 101 103 108 114 136 139 
149 152-153 157 168 171-172 175 

212-213 221-226 237- 
266 272 279-281 295 
310 317-318 321-324 
336-338 346-347 352 
377 379-380 382 384 



207-208 210 
238 251-253 
301-302 307 
330 333-334 
357 370 373 
391-392 397 399 
411 417 421 424 
437 440-443 454 
476 483 488-489 
513 516 519-520 
544 547 550 561 
590-591 595 597 
623 628-629 631 
657-658 660 665 
689 691-694 
710 716 720 
744 757-760 
806-807 810 



402 406-408 
426-427 430 
460 464 467 
497 508 
530 537-540 
572-574 582 
607-609 
638-640 
674-675 
699 701 



858 861 
894-895 
936 938 
959 961 



495 
524 
567 

597 604 607-609 615 

631 634 638-640 655 

665 669 674-675 679 

696-697 699 701 706 

728 732 734 736 742- 
763 775-778 780 799 

817-818 826 839 843 

871-872 884 890-891

904 915 921-923 935- 

950 952 955-956 958- 
967 969-971 990 992 
Tissue Origin | RNA Source



Hyseq 
Library Name 



fetal brain 



Clontech 



FBRS03 



fetal brain 



Invitrogen 



FBT002 



fetal heart 



fetal kidney



Invitrogen 



Clontech 



FHR001 



SEQ ID NOS: 



999 1001 1 
1016 1022 
1035 1042 
1065 1067 
1114-1115 
1151 1153- 
1172-1173 
1190-1200 
1226-1227 
1253-1255 
1270-1273 
1314 1317- 
1339 1341 
1371 1373 
1386 1392 
1425-1426 
1440-1441 
1502-1503 
1519 1536 
1559 1573 
1611-1614 
1640 1651 
1693 1696 
1718 1720 
1730-1733 
1742 1745 
1767 1771 
1786 

S 



005-1006 1008 1 
1024 1029-1030 
1047-1048 1052 
1070 1082 
1119 1131 
1156 1160 
1178 1184 
1211 1216 
1229 1231 
1258 1260 
1281 1287 
1320 1326 
1344 1350 
1376 1379 
1396-1398 
1428-1429 
1448 1466 
1507 1511 
1544 1549-1550 
1589-1590 1598 
1619 1621 
1657-1658 
1703-1704 
1722 1724 
1735-1736 



1089 
1143- 
1163 
1186 
1222- 
1236 
1262 
1308- 
1334- 
1356 
1381- 
1419 
1432 
1470 
1513 



1625 
1676 
1713 
1726 
1738 



1755 
-1772 



1759-1761 
1777 1779 



013 
1032 
1056 
1109 
1149 
1167 
1188 
1223 
1245 
1266 
-1309 
-1335 

1369- 
-1382 
1423 
1437 
1482 
1516 
1557- 
1608 
-1626 
-1679 
-1714 
1728 
-1739 
1765 
-1780 



235-236 520 864 1068 1188 1587



15-18 20-21 24-25 29 34 43 61-63 
77-78 98 101 103 107-108 128 130 
136 146 148 165-166 171 174 181 
185 196-198 204-205 208 223 230 
235-236 251 253 251 268-269 280- 
281 284-285 288 309-311 321 329 
339 346-347 350 357-359 381- 
390 407 418-419 430 434-435 
443-444 461 464-466 483 490 
519 522 527 557 561-
590-591 
650 655 
700-701 
788-789 



334 
383 
438 
494 
562 
632 
682 
746 
829 



897-900 
948-949 



509 516 
572-573 
647-648 
690-691 
782 784 
840-841 
904 
954 



595 597 623 
669-670 672 
710 717 736 
814-815 825 
847 854-855 857-858
919 925 935-937 946 
960-962 966 969-970 
986 996 1000-1001 1005-1007 1012
1014 1022-1028 1045 1052 1055 
1068 1070 1072 1078 
1090 1109 1115 1118
1136-1137 1144-1145 
1157 1193-1195 1198 
1220 1222 1234 1257 

1280 1285-1286 
1317-1320 1330 
1349-1350 
1369 1379 
1507 
1564 
1595 
1638 



1274-1275 
1312 1314 
1344-1345 
1358 1364 
1431 1435 
1536 1547 
1582 1587 
1615 1619-1621 
1665-1666 1673 
1715 1723 1728 
1759-1761 1765 
1778 1781-1782 



1476 
1554 
1593 



1082 1085 
1120 1128
1149 1156- 
1204-1205 
1262 1271 
1294 
1342 
1355-1356 
1383-1384 
1519 1532 



1567 
1601 
1644 



1687-1688 
1749 1753 
1771 1774 
1786 



1578 
1608 
1561 
1690 
1757 
1776 



105 124 160 289 864 1036 
1229 1614 1616 1762 1785 



1148 



FKD001 



5-8 11 40 47 57 
124 163 171 216 



65-66 82 85 102 
222 224 235-236 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



258 277 280-281 307 
371 387 392 395 403
436 443 455 469 500 
563 572-573 585 600 
654 657-658 660 679 
798 821 833 844 854 
868 878 911 929 958 
992 1007 1046 1087 
1139 1285 1312 1331 
1371 1376 1391 1422 
1440-1441 1470 1543 
1618 1631 1651 1654 
1678-1679 1691-1692 



310 314 330 
422-423 431 
519 522 542 
619 623 650 
719 731 760 

-855 857 864 
960 969 990 

1103 1129 
1355 1369 
1425-1426 
1598 1601 

-1655 1669 
1733 1785 



fetal kidney 



Clontech 



FKD002 



352 384 426-427 440 583 602 1060 
1131 1324-1325 1636 



fetal kidney 



Invitrogen 



FKD007 



20-21 82 163 335 679 988-989 
1000 1227 1230 1320 1554 



fetal lung 



Clontech 



FLG001 



35-36 94 323 371 398 426-427 445 
473 549 560 604 616-617 626 631 
649 651 719 746 786-787 832 842 
849-850 864 894-895 1075 1178
1182 1200 1206 1309 1311 1345 
1429 1493 1567 1576 1620 1686 



fetal lung 



Invitrogen 



FLG003 



9 15-16 
102 124 



29 41 47 68 
137 152-153 
229 231 249 254 256 
300 325 333 344-345 
379 384 408 425-427 
468 475 483 488 493 
545 547 549 564 582 
660 662-664 670 673 
761 766-767 774 805 
864 875 921 932 937 
988-989 1014 1016-1 
1090 1097 1170 1185 



1216 
1342 
1414 
1536 
1601 
1667 
1739 



1224 1258 1290 
1347 1355 1369 
1431 1438 1449 
1547 1557-1560 
1636 1644 1653 
1671 1675 1680 



-69 83 88-89 
165 196 224 
267 291-292 
352 373 376 
430 432 467- 
516 531 535 
602 623 644 
725-726 728 
830 852-853 
946 949 963 

017 1024 1027 
1200 1215- 
1309 1320 
1381 1413- 
1491 1512 
1567 1590

-1655 1662 

-1681 1706 



1760-1761 1769 



fetal lung 



Clontech 



FLG004 



103 276 334 465-466 
1614 1658 



737 843 1131 



fetal liver- 
spleen 



Columbia 
University



FLS001 



3-11 13 15-21 25 30-39 41-48 50- 
51 54 56-58 60-66 68-69 72 75 
77-80 82-83 85 87 89 92-103 105- 
110 112 116-124 126-127 130 133 
135-139 141 144 147-149 152-153 
157 163-165 167-172 174 176-178 
180 186 188-190 193-194 196 198- 
200 202-206 210-214 219 221-231 
233-236 240-244 246-247 250-251 
255-256 258 261-265 268-269 272 
274 276-278 280-281 284-286 288 
293 295 299-301 304 306-307 309 
311 314 316 318 320-321 326 329- 
332 342 344-345 350 352-353 356- 
358 360 362 370-374 376 378-384 
386-387 390 392-393 400-401 403 
406 408 410-412 415 417 419 422- 
437 439-442 444-445 448 452-454 
456 459 461-470 472-479 481-483 
487-488 490-491 493 500-501 503- 
506 509-513 515-520 522-524 526- 
529 531 534 536-540 542 547-549 
553-554 561-562 564 567-568 571- 
576 579 581 583 585-597 599-605 
Tissue Origin



RNA Source



Hyseq 
Library Name 



fetal liver- 
spleen 



Columbia 
University 



FLS002 



607 610-613 615-621
628-634 636-640 644 
660 665 669-670 672 
681-682 684 690-695 
710 713-714 716-719 
731 734 736 
748 750-751 



623-624 
647-650 
674-675 
697 702 
725-728 
738 740-741 743-746 
759-766 768 772 774-



796 798 800- 
818-819 821- 
843-847 849- 
887 889-895 
919 921- 
953-958 
971 974- 
992-993 



777 779 783-788 793 
805 808 810-812 814
824 826-832 834-837
867 869-876 878-883 
897-898 902 904-914 916 
928 930-937 939 945-950 
960-961 963-965 967 969 
978 980-983 986 988-990 
995-997 1000-1002 1004-1008 1012 
1014 1016-1019 1025-1026 1028- 
1033 1035-1036 1039-1044 
1049-1050 1053-1056 1058- 
1061-1064 1067-1070 1072- 
1076 1078 1082 1085-1087 
1089-1090 1097 1099-1103 1107- 
1113 1115-1119 1121-1123 1125 
1127-1128 1131-1134 1136-1137 
1144-1150 1153 1159-1160 1163 
1170 1175 1177-1178 1188 1190- 
1192 1195-1200 1202 
1211 1214 1216 1218 
1234 1237 
1251 1254 
1270-1273 
1287-1290 



1031 
1047 
1059 
1074 



1206 1208- 
1221-1222 

1225 1227 1234 1237 1241 1244 
1246-1247 1251 1254 1258 1261 
1266 1268 1270-1273 1277-1282 
1284-1285 1287-1290 1294 1299- 
1300 1306-1308 1313-1320 1324- 
1325 1327 1330 1332-1333 1338 
1341 1343 1345-1347 1349-1350 
1353-1360 1362-1363 1365-1367 
1369-1370 1372-1374 1376 1378- 
1381 1383-1384 1386 1389-1391 
1400 1402-1403 1405-1410 1413 
1415 1417-1419 1422-1429 1431 
1435-1437 1439-1442 1445-1446 
1454 1458-1459 1466- 
1474 1477-1478 1480
1491-1493 1496-1498 
1509 1511-1512 1516- 
1529 1532 1536- 
1549-1550 1552- 



1448-1449 
1470 1472 
1482 1485 
1501-1507 
1519 
1541 



1524-1526 
1546-1547 



1554 1562 1564 1569 1572 1574- 
1575 1578 1581 1583 1587-1588
1594-1595 1597-1598 
1611-1612 
1620-1622 
1630-1632 
1653-1662 
1673-1674 
1701-1703 
1711 1713-1714 1718-1719 
1724-1727 1731-1733 1738 



1591-1592 
1600-1604 
1617-1618 
1627-1628 
1645-1651 
1669 1671 
1690 1696 



1614-1615 
1624-1625 
1634-1639 
1664 1667- 
1676-1688 
1706-1709 
1722 
1740- 



1741 1743-1744 1746 1748 1751- 
1752 1754 1760-1765 1767-1773
1780 1783-1786 



3-11 13 15-21 26 29 32 35-3* *2 
44-45 48 50-51 54-55 57-58 61 64
68-69 73-75 78 80 82 84 87 95-98 
100 103 105 107-108 110 
116-119 122-125 128 130 
145 147-153 155 157 159 
166 168 171-172 174-175 
188-189 193-194 196-198 



112-113 
137-139 
161-163 
177 181 
200-203 



124 




PCT/US00/34263



Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS 



231-232 240-244 246-247 250-251 
258-259 262 264 268-269 272 275 
277 280-281 284 286 288 290-292 
295 298-299 301-304 306 308-310 
318 320-321 323 325 329 331 334 
342 348-349 352-353 356 359 368 
371 374 376-379 381-384 386-387 
392-393 397-398 400-401 403 410- 
413 421 423 426-427 429-430 433- 
436 438 440 443 445 448 451-452
454-455 460-463 465-467 469 471- 
473 475-476 478-479 481-483 487 
490-491 493-494 497 500-501 505-
508 509-513 515-517 519-520 524
526-531 534 537-542 544 547 552-
554 556 558 561-562 564-567 571- 
577 583-587 590-591 593 595 597 
601 604-606 608-613 616-617 619- 
624 626-632 634 637-642 644 647 
649-652 654-659 662-665 669-672 
674-675 681-682 685 688 690 696 
698 700-703 707 709-710 713 717 
719-721 723-724 728 731-732 734 
737-738 742-745 748 752 754 759 
763-766 768 770 773-777 780 782 
784 786 791 795-798 801-802 805 
808 811-812 818 823-824 826-827 
832 834-837 839 843 846 848-856 
858-861 865 867 869 871 873-874
876 878 881-882 887 889 892 894- 
898 901-902 904 906-908 913-915 
919 921-924 926-932 934-935 937 
939-941 943 946-947 950 953 958 
961 965-967 971 973-975 977-979 
981 984-985 990 992-993 995-997 
999 1001 1004-1007 1009-1011 
1013 1016 1020 1023 1025 1027- 
1031 1033-1035 1039-1042 1044- 
1045 1049 1053 1055-1056 1058- 
1059 1062 1064-1065 1067-1070 
1072-1074 1079 1082 1087 1089 
1093 1097 1099-1103 1105-1107 
1109-1114 1123 1125-1127 1132- 
1134 1140 1143-1145 1148-1150 
1156 1158 1160 1163 1172-1173
1177-1178 1181-1184 1190-1192 
1195-1197 1199 1204 1206 1208 
1211 1214 1216 1219 1227 1230 
1234-1235 1237 1240-1241 1243 
1245 1247 1256 1258 1260-1261 
1264 1268 1270-1271 1275 1278- 
1279 1284-1286 1288-1289 1299- 
1301 1306 1308 1312 1314 1317- 
1319 1323-1325 1327-1330 1334- 
1335 1339 1343-1347 1349-1350 
1354-1355 1357 1360 1362-1363 
1365-1367 1369 1372 1376 1378- 
1380 1386 1389-1391 1394 1400 
1403 1406 1409 1416-1419 1422- 
1427 1429 1435 1437-1438 1440- 
1442 1446 1448-1450 1453 1460- 
1461 1468 1470 1472 1474-1475 
1478 1482 1486 1490-1493 1496 
1498 1500-1504 1506 1508-1509
1511-1512 1516 1518-1519 1521 
1524-1528 1531 1536-1538 1543 
1547 1550 1554 1556 1564 1567- 
1569 1580 1587-1588 1591-1592 
Tissue Origin | RNA Source



Hyseq 
Library Name 



fetal liver-
spleen 



Columbia 
University 



FLS003 



fetal liver 



Invitrogen 



FLV001



fetal liver



Clontech 



FLV002 



fetal liver 



Clontech 



FLV004 



fetal muscle Invitrogen 



FMS001 



SEQ ID NOS 



1597-1598 1600-1601 1611-1612 
1618-1628 1630-1631 1635-1638
1641 1646-1649 1652 1654-1659
1661-1662 1664 1667-1669 1674 
1676-1679 1683-1684 1686-1688
1691-1692 1699 1702 
1713-1714 1717 1719 
1727 1730-1733 1738 
1744 1748-1752 1758
1763-1764 1767 1769 
1776 1779 1783-1786 
103 300 318 321 3i>2 
384 392-393 403 422 
435 440 444 453 503 
978 1064 1324-1325 



1707 1711 
1722 1726-
1740 1743- 
1760-1761 
1772-1773 



372 
424 
515 
1327 



379 381 
429 434- 
544 592 
1333 



1424 1622 
1689-1690 



1357 1369 1378 1418 
1646 1649 1680-1681 

1717 1743-1744 1769

15-16 26 34 58 61 64 70 75 78 89 
98 105 112 116 120-121 123 133 
151 166 176 180 194-196 198 200 



204-206 
235-236 
277 280-281 
321 329 344 
382 395 408 
435 441-442 
506 509 522 
567 569-570 
658 667 669 
725-726 732 
786 809 817 
873 875 881 
916 954 963 
989 993 995 
1008 



210-211 
239 247 



303
356 
412 
465 
527 



220 
259 
310 
371 
414 
466
534 



225-226 
261 267 



313 
374 
419 
490 



317 
376 
429 
494 



230 
272 
320- 
379- 
434- 
504- 
562 
657- 
717 
784 
872- 
911 
983- 



1070 
1119 

1157 

1250 
1285 



552-553 
572-574 607 631 
672 685-686 702 
748 759 761 778 
829 837 857 861
889 894-895 909 
967 974 977 986 
997 1000 1005-1006 
1014-1015 1020 1042-1043 
1086-1087 1089-1090 1118- 
1122 1144-1145 1148 1153 
1159 1183 1195-1196 1227 
1257-1258 1262 1267 1280 
1307 1312 1314 



1344-1345 
1363 1403 



1349-1350 
1405 1415 



1317-1320 
1355 1362- 



1426 
1464 
1539 
1583 
1644 
1738 
1779 



1429 1431 
1469-1470 
1549-1550 
1598 1601 
1649 1666 



1442 
1489 



1419 
1446 
1528 



1557-1562 
1611 1615 
1674 1706 



1746 1763-1765 1774 



1425- 

1463- 

1536 

1577 

1622 

1721 

1776 



676 998 1719 



93 133 214 301 35>b J 
581 601 679 837 847 
1236 1270 1313 1324- 
1355 1367 1425-1426 
1733 1760-1761 



74 379 555 
859 1123 
1325 1327 
1536 1690 



"26 37-39 50-51 bB »4 
113 128 131-132 139 
194 198 201 206 211 
261 276 282 286 302 
375 379 383 398 412- 
436 448 452 462-463 
519 529 561 569-570 
607 623 626 635 647 
725-726 730 733 761 
826 837 860 874 913 
970 980 986 988-990 
1001 1007 1014 1027 
1045 1060 1064 1070 



86 89 98 
155 172 186 
230-231 256 
325 359 361 
413 419 430 
473 477 503 
590-591 597 
660 672 715 
775-777 788 
915 921 935 
992 1000- 
1035-1036 
1083 1097 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



fetal muscle 



Invitrogen 



fetal skin 



Invitrogen 



FMS002 



FSK001 



SEQ ID NOS 



1099 
1173 
1266 
1324 
1383 
1433 
1557 
1632 
1712 
1766 



1102 
1198 
1270 
1325 
1384 
1505 
1559
1644 
1725 



1116- 
1208 
1277 
1329 
1399- 
1514 
1552
1650
1726



1117 

1228 

1298 

1336- 

1400 

1542 

1589 

1652 

1743 



1121 
1240 
1317 
1337 
1403 
1551 
1599 
1671 
1744 



1164 
1258 
1320 
1369 
1409 
1554 
1620 
1675 
1754 



119 221 2 
599 736 8 
1431 1440 
1673 1678 
1712-1714 
1743-1744 



73 402 426-427 463 547 
69 1000 1033 1083 1266 
-1441 1458 1545 1599
-1679 1687-1688 1710 

1723 1725 1731-1733 

1760-1761 1767 



1 4-11 15-16 20-23 25 29 33 40 
43 46 56-57 60-61 64-66 75 82 87 
97-98 105 107-108 113 118-119 
123 133 135-137 139 144 146 148



151-153 156 163 
189 197-198 200 
231 246-247 
285-286 290 
321 325 328 



222 
277 
311 
341 
362
388 



180 188- 
210 218 
265-270 
301 307 



351-352 
370 372 
404-405 



345 
368 
394 
419-420 424 
445 448-449 

490 493 504 506 
526 531 537-540
567 572-573 581 
615 623 630-631 
657-658 
676 678 
713 
732 



660 
681 
717 
748 



476 
519 
561 
612 
651 
672 

709-710 
728-729 

766 770 775-777 
789 798 809 811 
824-826 831 842 
864 881 894-895 
918 922-923 928 
946 948-949 953 
970 975 977 986 
1000 1004 1007 
1027 1032 
1057-1058 
1072 1077 
1103 1108 
1131 1134 
1153 1156 
1189 1192 
1205 1208 
1220 1222 



1266-1267 
1285 1299 



170 176 
202-203 
261 263 
293 299 

330 333-335 339 
355-356 358-359 
376 379-382 384 
408-409 411-412 
426-427 436 441-442 
454 462 465-466 472 
509 515-517 
547 549 560- 
584 589 611- 
635 647 649 
662-665 667 669 
688 701 704-705 
720-721 725-726 
750 753 759 764 
780-781 786 788- 
814 816-817 822 
857 859 861 863- 
908 910-911 916 
932-933 935 937 
960-961 966-967 
990 992-993 999- 
1013 1018 1025 
1035 1041-1043 
1060 1062-1064 
1090-1091 1097 
1113 1119 1123 
1140 1148-1149 
1163 1167 1178 
1198 
1216 
1243 
1280 



1195-1196 
1211-1212 
1225 1240 
1274 1277 
1310 



1054 
1069 
1099- 
1128 
1152- 
1182 
1201- 
1219- 
1258 
1282- 
1324- 
1346 



1317-1322 
1325 1329-1330 1342 1344 
1349-1351 1354-1357 1365-1366 
1373 1376 1378 1380 
1387 1399-1400 1405
1429 1431 1433-1435 
1448-1449 
1472 1475 
1490-1491 1493 
1521 1525-1526 1529 



1369 1371
1383-1384 
1410 1427 
1439-1441 
1468 1470 
1487 
1512 



1454 1457 
1480-1481 
1498 1509 



1536 1547 
1592 1595 
1604 1608 



1549 1557-1559 
1597-1598 1601 
1611 1614 1618 



1535- 
1588 
1603- 
1624- 
Tissue Origin | RNA Source



Hyseq 
Library Name 



SEQ ID NOS:



fetal skin 



Invitrogen 



FSK002



13 286 302 307 313 321 330 335 
339 341 354 370 372 385 400 402 
408 414 425-427 433 436 450 454 
515 544 585 598 767 810 845 939 
1076 1109 1155 1317-1320 1326 

1347 1350 1369- 
1391 1397 1422 
1678-1679 1687- 
1721 1725 1731- 



1333-1335 1343 
1371 1377-1378
1466 1647 1656 
1688 1693 1718 
1732 1739 1755 



fetal spleen 
umbilical cord 



BioChain 



BioChain 



FUC001 



110 137 211 
1639 1771 



1108 



4-8 10 12 14 
64 68-69 75 
114 116 119 
154 157 161 
184 186 192 
215 230 234 
267 271-272 
314 317 321 
356 368 
392 394 
420 424 427 
454 459 461 
486 488 490 
537-540 547 
591 593 606 
645-647 650 
668 674-675
703-705 709 
727 732 
777 780 
814-817 
861 864
900 903 
933 936 
984 990 
1007 1016
1047 1059
1089 
1134 
1163 
1216 



17 33-36 44-46 57 
82 85 101 104 113- 
122-124 133 137 153- 
163 166-167 175 
197-198 200-202 
246-247 251 256 
280-281 284 295 
326 333-335 345 
371-373 379-380 386 
406 408-410 412 414 
427 430-436 438 
461 463 467 473 
490 495 504 509 
547 555 561 574 
606 615 620-621 
650 659-660 
684 687 



444-446 
482-483 
524 526 



1243-1244 
1287 1298 
1350 
1381 
1424 
1442 



711 
749-750 
789-791 
822 833 
875 879 
906-907 
940 948 
992 998 
1023 

1061-1063 
1094-1097 
1144-1148 
1171 1197 
1218 
1246 
1316 
1357 1359 
1398 1400 
1427-1428 
1446 



1484-1485 
1505 1513 



1454-1455 
1489 



577 
632 
662-664 
696 698 
714 719-720 
762 765 771 
793 796 802-803 
843 845 848 858 
888 894-895 897- 
911-912 925 930-
953 960 966 977 
1000-1001 1005- 
1025 1037 1046- 
1073 1076- 
1112-1113 
1151 1154 
1204-1205 
1234-1235 
1283 1286- 
1344 
1373 
1408 
1433 
1479 



1525 
1565 1567 
1578-1579 
1608 1612 
1636-1637 
1656 1658 
1682 1684 
1709-1710 1722 
1738 1740-1741 



1492-1493 
1527 1536 
1571 1573 
1591 1595 
1615 1621 
1647-1648 
1661-1662 
1686-1688 
1727 1729 
1760-1761 



fetal brain 



GIBCO 



HFB001 



4 9 11-13 17-18 22-23 25 37-39 
42-47 50-51 54-55 58 60-61 65-66
Tissue Origin | RNA Source



Hyseq 
Library Name



SEQ ID NOS: 



72 75 77 80 
102 107 110 
123 126 
153-155 
181 186 
208 210 
235-238 
260-262 
286 289 
321-323 325 
349 352 354 
371-372 377 
390 400 408 
434-435 438 
455 457-463 
478 482-483 
496 499-500 



82 85 90-91 94 100- 
112-116 118-119 122- 
128 134 136-140 147-148 
157 161 165 169-172 175 
188-189 197-198 204-206 
215 222-223 225-226 230 



611-612 
636 643 



240-241 247 
267-269 276 
298 300-302 
325 330-331 339 
354 356-359 362 
377 379-380 382 
408 414-416 419 
438 441-443 449 
470 472-473 
486-488 
502-504 
516 519-520 522 
537-540 543-544 
569-570 572-582 
593 595 599 601 

614-620 622-624 
645-647 650-652 
665 667-668 670-672 
687 689 692-694 697 
717 721 727 729-732 
743-746 750-751 759 
772 775-777 784 789 
802-805 810-811 814 
834-837 
862 864 
886-887 
905 
925 



253 256-258 
279-281 284 
307 310 316 
339 341 346- 
362 364-365 
382 384 387 
419 424 431 
449 451 
475 



826 830 
858-860 
879 883 
898-901 



490-491 
506-507 
525-526 
546-547 
585 588 
604 

630 632 

654 659 

676 678 

699 710 

734 736 

763 766 

789 791 796 
814 819-821 

839-850 854-

869 871 876 

890-891 893 
908-910 912-916 

927 930-933 935 



922-923 

948 952-960 963-964 967 969 
975 978-979 981 983 986-987 
992 995 997 999-1002 1005- 
1011-1013 1016 1018-1019 



1023 1026 
1038 1041 
1059 1064 
1078-1079 
1094 1097 
1115 1121-1122 
1138 1140 1143 
1156-1157 1159 
1193-1194 1200 



1029-1031
1047 1050
1068 1070
1081-1082 
1103 



1033-1035 
1053 1057 
1072-1073 
1086 1089 
1107-1109 1113- 
1127 1134-1135 
1148-1151 1153 
1167 1170 1175
1202 1207-1209 



1211 1216 1219-1220 1226-1227
1229 1232-1234 1240-1241 1243 
1246 1249-1251 1253-1254 1258
1267-1268 1271 1276 1279 1282
1285-1289 1293-1294 
1308 1312 1316 1320 
1339 1341-1344 1346 
1357 1359 1365-1366 
1379 1386 
1413-1414 
1425-1427 
1442 1445-1452 
1463-1464 1468 
1474 1477-1479 1489 1492 
1497-1498 1501-1503 1507 
1511-1513 1517 1520-1521 
1526 1531-1533 1535 1537-1538
1547 1554 1556-1559 1564-1567
1571 1584 1587 1589 1594 1599- 
1601 1611-1612 1614-1616 1619- 
1620 1625-1628 1630-1631 1634 
1637-1638 1640-1643 1645 1648- 



1373-1375 
1398 1409 
1420-1421 
1437 1439 
1457 1459 



1305 1307- 
1327 1338- 
1349 1355- 
1369-1370 
1389 1394 
1416-1417 
1430 1433 
1454- 
1470 
1494 
1509 
1524- 
Tissue Origin



RNA Source



Hyseq 
Library Name 



SEQ ID NOS: 



1649 1651 1653-1655 
1664-1665 1667 1669 
1679 1683-1684 1686 
1704-1705 1709 1713 
1720 1724 1727-1728
1737-1738 1743-1744 
1755 1757 1760-1761 
1779 1785 



1657-1658



1673 
1693 
1714 
1731 
1752 
1765 



1678 

1701 

1717 

1733 

1754-

1772 



macrophage 



Invitrogen



HMP001 



5-8 110 204-205 503 634 678 859 
878 933 988-989 1379 1448 1504 



infant brain 



Columbia 
University 



IB2002 



10 12-13 15-18 22-23 25 29 34 
37-39 43 47 50-51 54-56 58 60-63 
65-66 68-69 72-74 80 82-83 86 
88-92 97 100 102-104 106-108 110 
112-113 115-116 118 123 128 130 



134-136 138-139 
152 154-155 163 
175 181-184 186 
203-205 209-210 
226 231-232 235-236 
252 257 260 268-269 



143 147-149 151- 
165-167 169 172- 
193-196 198 201 
214-215 222 224- 
239 
272 



279-281 286 288 
300-301 304 307 
330-331 333-334 
352 356-357 362
380 383-384 392 
411 413-414 416 
430-431 434-435 



454 461 
475-476 
494 497 
519-520 
547 550-551 
572-576 579 



291-292 
310 313 
339 346-347 
371-372 377 
397 401 
418-419 
438 443 
469-470 
487 



246-247 
276-277 
295 298 
321-323 
349 
379- 
406 408 
422 428 
449 453- 
472-473 
490 492 



464-466 
478 482-483 

503 507-508 510-513 516 
524-526 530-534 536-540 
561 563-564 566-567 
581-582 584-587 590- 



591 593 
616-617 
641 645-647 
665 667-675 
703 707 713-715 
733-736 739 743 
763 769-770 772 
788-789 793-794 
814 825-826 830 
845 848-850 
865 870 872 



595-597 607-609 611-613 
620 622-624 627 631 637



650-655 657-658 660- 
689 691 695 697 699 
717 721 728-731 
745 751 755 759
778 780-781 785 
799 803 808 811 
834-836 840-843 
854-855 860 862 864- 
875-876 878 886 888 
890-891 894-896 898 903-904 916- 
917 919 922-925 927-928 930-932 
934-936 938 941 945-946 948-950 
953-954 959-962 966-969 977 979 
981 986-990 992 997 999-1000
1004-1006 1014 1016 1018-1019 
1024-1025 1033 1036 1047 1051- 
1052 1054-1055 1057-1059 1063- 
1064 1068-1070 1073 1081-1082
1085 1089 1108-1113 1118-1120 
1123-1124 1130 1132-1138 1140 
1149 1151 1153-1154 1163-1170 
1172 1174-1175 1183-1184 1188 

1196-1197 1199 
1211 1218-1222 
1231 1234 1241 
1256 1258 1261-
1279 1281 1283 
1294-1295 1305 
1316-1320 1329 
1345 1349 1356
1368-1370 
1388 1400 



1190 1193-1194 
1204 1208-1209 
1226-1227 1229 
1247 1249 1251 
1262 1269 1274 
1285 1287-1289 
1307 1313-1314 
1332 1341-1342 
1362-1363 
1374 1381 



1365-1366 
1383-1384 



1403 1406-1407 1413 1417 1420 
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Tissue Origin 



RNA Source 



Hyseq 
Library Name 



infant brain 



SEQ ID NOS 



1423 

1441 

1454 

1468 

1483 

1499 

1522- 

1542 

1555 

1580 

1593 

1610 

1624 

1639- 

1654- 

1672- 

1693- 

1717- 

1733 

1755- 

1777- 



1429- 
1443 
•1455 
1470- 
1485 
1502- 
1523 
1546- 
1563 
1583- 
1595 
1612 
1626- 
1640 
1655
1673 
1695 
1720 
1735- 
1758 
1778 



1431 

1447- 

1457 

1471 

1493- 

1503 

1525 

1547 

1565- 

1586 

1598 

1614- 

1627 

1642 

1658- 

1676- 

1701- 

1723- 

1741 

1762 

1786 



1435 

1449 

1459

1475 

1494 

1505- 

1528 

1549- 

1567 

1588 

1600- 

1616 

1630- 

1644 

1659 

1681 

1702 

1724 

1743- 

1765 



1436 

1451 

1463 

1479 

1496 

1507 

1531 

1550 

1569 

1590 

1601 

1619 

1633 

1647 

1664 

1685 

1704 

1726 

1744 

1771 



1439 
-1452 
-1465 
1482 
1490 
1509 
-1533 
1554- 
1575 
1592- 
1608- 
1621 
1637 
1652 
-1665 
-1688 
1708 
-1728 
1752 
1774 



Columbia 
University 



IB2003 



infant brain 



infant brain 



Columbia 
University 



Columbia 
University



IBM002 



310 322 
349-350 
384 403 



472 
520 
576 
601 
650 
675 
734



476 
530 
585 
606 



IBS001 



17-18 20-23 29 34 43 60 68-69 
78-80 88 100-101 107 110 112 118 
123 128 133 135-137 146 148 152 
159 166 169 174 194 198 203 215 
223 225-226 229 235-236 247 260 
276-281 286 290-292 295- 300-301 
324 331 334 339 346-347 
352 357 371 376-377 382 
408-409 414-415 453-455 
478-479 490 503 507 516 
534 536-540 551 563 572- 
587 590-591 593 595-596 
612 616-617 620 622-624 
652-653 661 665 670-671 674- 
678 609 715 717 727-728 730 
759 775-777 780-781 785 796 
806-807 811 824 845-846 864 869 
875 882 889 894-895 898 904 917 
919 921-923 932 935-936 946 950 
954 962 977 979 997 999-1000 
1005-1006 1009 1011 1017 1024 
1033 1037 1043 1055 1057 
1114-1115 1120 1123 1127 
1145 1149 1151-1153 1160 
1170 1174 1193-1194 1196 
1202 1206 1209 1220-1221 

1251 1258 
1314 1327 
1356-1357 
1388 1400 
1436 
1459 
1535
1587 
1631 
1673 



1229 1240-1241 
1288-1289 1305 
1344 1347 1350 
1366 1378-1379 
1421 1423 1431 
1446-1447 
1503 1507 
1559 1567 
1610-1612 
1647 1657-1658 
1683-1684 1701-1702 
1713-1714 1719 1757 
1765 1771 1778 



1457 
1509 
1572 
1615 



1109 
1144- 
1167 
1199 
1226 
1284 
1333 
1365- 
1403 
1440-1441 
1471 1499 
1546 1557- 
1595 1598 
1639 1644 
1678-1681 
1708-1709 
1760-1761 



101 113 139 152 260 279 290-292 
374 377 551 563 608-609 653 659 
814 954 1005-1006 1029-1030 1130 
1164 1209 1258 1294 1305 1320 
1327 1397 1431 1498 1507 1615 
1640 1694-1695 1763-1764 1767 
1779 



10 12 119 175 279-281 321 334 
371 446 551 563 623 652 667 669 
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Tissue Origin 



lung, 
fibroblast 



RNA Source 



Strategene 



Hyseq 
Library Name



SEQ ID NOS: 



LFB001



lung tumor 



Invitrogen



LGT002 



"671-672 819 949 966 
1151 1188 1193-1194 
1258 1265 1271 1287 
1324-1325 1342 1423 
1448 1471 1482 1525 
1562 1569 1588 1591 
1647 1649 1658 



1113 1130 
1196 1229 
1317-1319 
1440-1441 
1532 1546 
1610 1618 



17 20-21 25 
157 197-198 
262 266 
370 427 
498 503 
542-544 
607 615 
712 
810 
876 



5-9
153 

213 223 
333 356 
472 493 
537-540 
599-600 

692-694 712 719 
794-796 810 837
856 869 876 903 
964 975-976 984 
1024-1025 1033 
1070 1072 1082



105 



233 
430 
516 
562 
630 
745



212- 
326 
446 462 
527 535 
567 586 
662-664 
775-777 
849 854 
955-956 



1136-1138 
1233 1246 
1320 
1446 
1552 
1620 
1655
1690 



1140 
1279 
1334-1335 
1478 1482 



1555 1567
1625 1632 
1662 1680 
1696 1702 
1760-1761 1778 



63-69 82 94 
203 207-208 
302 321 
436 
519 
565 
647 
748 
843-847 
934 953 
1000 1005-1007 
1039 1053 1064 
1112-1113 1134 
1195 1223 1232- 
1295 1311 
1427-1428 
1504 1537
1582 1598 
1645 1654- 
1684 1686 
1733 1741 



1285 
1343 
1493 
1575 
1638 
1681
1711 
1785 



5-10 18 20-21 29 33-36 40 43 52 
54-55 61 65-66 68-70 73-75 80 85 
88-89 93-94 100 103 106-108 112- 
113 115-116 118-119 123-124 126
130-132 135-137 139-141 143-144 
147-148 151-153 155-15S 159 161 
164 169 171 179-180 185 190 192 
194 196-199 203-208 210 212-214 
216-217 219 222 233 240-241 244 
246 251-252 255-256 261-262 266
272 276-277 279-281 284 286 288 
290 295 298 301-302 309-312 317 
321 329 332 341-342 344-345 348 
352 358-360 363 368 370-371 376 
380-381 384 389-390 398 400 409
414 423 426-427 430 432-436 443- 
444 450-451 454 462 468 472-477 
480-483 487-488 490-491 493 496- 
498 500 503-506 509-512 515-516 
519 521-523 526 530 534 541 544 
547 554 557 564 566-567 572-576
585-586 588-589 595-596 601 607 
611-612 615 619 621 623 626 630 
632-633 644 647 649 651 655-656 
660 662-665 667 669 672 683-684
696 700 706 710 713 716 718-719 
722-723 728 734-739 743 750 752 
763 765-766 773-778 784-785 787- 
789 791 800 802-803 809-812 814 
824 826 828-829 832 838-839 841- 
845 849-850 852-855 857-861 864 
866 874 878-880 882 887 890-891 
897-898 902 904 906-907 910 916 
918-920 922 924-925 927 930-932 
934-935 937 947 950 953 955-956
961 963 966-967 969 971 977-979 
981 984 986-987 990 992-993 995 
997 999-1001 1005-1007 1009 
1012-1013 1018 1020 1022-1024 
1026 1029-1030 1033 1038 1041 
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Tissue Oiriciin 



RNA Source 


Hyseq 
Library Name 






SEQ 


ID NOS: 

• 










1045 


1047- 


1050 


1052


1054 



-1055 








1059 


1063- 


1064 


1067-


1071 


1073- 








1074 


1078 


1085 


1087 


1089 


1095- 








1037 


1104 


1106- 


1107 



1109 


1112 








1116 


-1117 


1119 


1126 


1134 


-1135








1139 


1141- 


1142 


1144-


1145 


1148 








1152 


-1153 


1156- 


1158 


1167 


1170 








1172 


1178 


1195- 


1196 


1198 


-1200 








1202 


1204 


1208 


1214 


1216 


1219 








1222 


1227 


1234 


1241 


1247 


1252 








1257 


-1258 


1265 


1267- 


1270 


1276 








1278 


1280- 


1281 


1283 


1285 


1288- 








1289 


1295 


1300 


1305 



1308 



1312 








1317 


-1321 


1329 


X J J O 


1339 


1341 




■ 




1344 


-1345 


1349- 


X J 3 J. 


X J JO 


-1355 




■ 




1357 


1365- 


1366 


1 ICQ 
X -3 O 7 


1j / D 


-1379 








1383


-1385 


1394 


X ,3 J 1 


•t inn 
x 4 uu 


1402- 








1403 


1408 


1417 






-1426 








1431 


1433- 


1436 


*J o 

1438 


1444 


1446-












1455 


1460


14 DO 


1468 










1474


1480-


1481 


1483 


1486- 










X J w 


1491


1494-


149& 


1506 








X 3 V O 


X ? w 7 


1511- 


1512 


Ibis 


-1516 








T en a 


1 533- 


X J <. "x 


1528- 


1529 


1536- 








J. J*i VJ 


T 54 K 


1549-


1550 


1555 


1560- 










J. -> O w> 


J.O D / 


1569 


1575 


1588 










1593- 


1594 


1596- 


1598 


1600- 


* 






1602 


1608 


1614- 


1616 


1618 


1620 








1624 


-1625 


1627- 


1632 


1636 


1639 








1644 


-1645 


1647- 


1649 


1652 


-1653 








1656 


-1662 


1664 


1666- 


•1657 


1670- 








1671 


1673- 


•1675 


1678- 


1679 


1683 








1685 


-1688 


1690- 


1692 


1696 


-1699 








1705 


1709 


1716- 


1717 


1722 


1727 




• 




1730 


1735 


1739 


1741 


1743 


-1744 








1748 


-1749 


1753 


1760-1762 


1765 








1767 


1770-1771 


1773 


1775 


-1776 




• 




1778 


-1779 


1786 








-- 

lymphocytes 


-ft rprV» 


Ijx'VUUx 


4 11 


-12 18 24-25 30- 


•31 4 


8 50-51 


1 




56-57 68-69 80 


92 98 103 


105 110 








126 


137 152-153 


157 


165 


172 188- 








189 


197 203 210 217- 


•218 


222-223 




: 




225- 


226 229 231 


247 


251 


256 264 








272 


280-281 284 


300- 


•301 


321 325- 








326 


339 348 352 


357


371 


382 384 








390 


400 404 412 


414 


421 


423 426- 


■ 






427 


430-431 445 447- 


-448 


451 454- 








455 


475 503 516


526- 


•527 


530 537-








540 


549 556-560


563


574 


577 589 








602 


613 615-617 


' 621 


623 


628-630 








636- 


637 647 649 657- 


-659 


690 697 








717 


723 755 764 


775- 


-777 


780 786 








789- 


790 793 800


802


822 


838 849 








866 


869 876 881-883 


892 


898 906- 








907 


911 921-923 


928 


975 


990 992 








996 


1001 1004-1007 1033 


1050 








1054 


1078 


1107 


1135 


1140 


-1141 








1143 


1148


1158 


1163 


1177


1199 








1205 


1216 


1226 


1231 


1236 


1241 








1244 


1250 


1258 


1260 


1265 


1269- 








1271 


1290- 


•1293 


1308 


1312 


1317 








1319 


-1320 


1339 


1345- 


-1346 


1348 








1350 


-1351 


1357 


1367 


1369 


1379 








1381 


1383- 


•1384 


1386- 


-1387 


1389 








1394 


1397 


1405 


1423 


1425 


-1428 




i 




1431 


1437 


1446 


1448 


1461 


1466 








1470 


1472 


1474 


1482 


1492 


1506 








1528 


1537 


1546 


1549 


1591 


1598 








1600 


1603- 


•1604 


1606 


1627 


1636 
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Tissue Origin 



RNA Source 



leukocyte 



GIBCO 



Hyseq 
Library Name 



LUC001



1638 1647-1649 1651 1658-1659
1664 1676-1677 1680-1681 1687- 
1688 1699 1711 1715-1716 1726 
1728 1737 1740 1746 1748 1752 
1756 1758 1777 1779 



212-215 
236 247 
274-277 
307-310 



3-4 10-11 13 15-18 20-^1 J.*-2.-=> 
30-31 35-36 40 43-45 48 50-51 
54-58 60-63 68-69 75 79-80 82-83 
85 88-91 93-96 98 100 103-104 
107-108 112 116 119 123 125-128 
134-140 142 147-149 151 153 155 
157 162-163 167 169-172 174 
179 186 190 192-199 203-207 
217-219 222-223 229 
251 255-258 260 262 272 
280-281 285-286 297-301 
313-314 316-317 321 325- 
330 333-334 340-342 348-349 352 
354-358 370-371 380-385 387-388 
400 405 408-410 412 414-416 421- 
425 430-431 434-435 437 439 441- 

453-454 456 459 461-
474-479 481 483-485 
499-501 503-504 509- 
522 526-527 529-531 
542 547-549 553-559 
574-577 579 582 584-
595-597 601-602 604 
606-607 611-613 615-621 623 627- 
629 633 636-637 642 644-650 655 
659-660 662-665 667 669 674-675 
678 682-684 692-696 698 
708 710 716-720 725-726 



442 445-451 
464 468-472 
487-491 496 
513 516-519 
534 536-540
566-567 571 
586 589 593 



749 751 
770-773 
796 793 
817 819 
838 843 
877-879 
904-914 
935-936 
955-956



700 706 
729-736 
753 756 
780 784- 
800 802- 
826 828- 
845-860 
881-892 
916 919- 
941-942 
958 960- 
975 977 



738-739 743-746 
759 765-766 768 
786 788-790 793 
803 810-811 814 
830 832 834-836 
863-864 866-871 
894-896 898 902 
925 927 930-932
945 948-949 953 
962 964 967 970-971 973 
985-990 992-993 995-996 
1004-1009 1011 1014 
1022-1023 1025 
1033-1036 1038 
1050 1053-1054 
1062 1064 1068 
1085-1086 1089-1091
1106-1107 1110-1113 
1122-1123 1125 1129 
1135-1137 1140-1145 
1163 1168 1170-1174 
1180 1182-1183 1186 
1200 1202 1205-1206 
1219-1221 1223-1227 
1238-1242 1247 1252 
1258 1261-1262 1264-1265 1269- 
1270 1272-1275 1277 1280-1284 
1287-1293 1299-1300 1306 1308
1312-1313 1317-1320 1322 1324- 
1330 1333-1335 1339 1341 1343- 
1347 1349 1353-1357 1359-1361 
1365-1367 1369-1370 1373-1374 
1377 1379-1381 1386-1387 1394 
1400 1403 1409 1419 1423 1425- 
1428 1430-1431 1433-1434 1437- 
1440-1442 1446-1448 1450



999-1002 
1017-1019 
1027 1029-1031 
1041 1043 1047 
1058-1059 1061- 
1070 1072 1078 
1093 1097 
1115-1117 
1132-1133 
1152 1158 
1176-1178 
1195 1198- 
1211 1216 
1230-1236 
1254 1256 



1438 14' 
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Tissue Origin j RNA Source 



Hyseq 
Library Name 



leukocyte 



LUC003 



melanoma from 
cell line ATCC 
#CRL 1424



MEL004 



mammary gland Invitrogen



MMG001 



SEQ ID NOS 



4 35-36 44-45 61 68-69 7b 1^
119 139 154 179 197 244 280-281 
324 372 404 430-431 45S 461 476- 
477 481 503 537-540 554 575-576 
581 589 608-609 621-622 624 630 
632 647 662-664 669 679 698 764 
773 775-777 802 848 851 856-857 
879 905-907 915 949 952 990 992 
1002 1113 1119 1170 1183 1216 
1236-1237 1241 1275 1346 1353 
1357 1359 1377 1506 1515 1534 
1553 1591 1600 1613-1614 1621 
1628 1670 1676-1677 1691-1692 
1699 1733 1738 1772 



25 35-36 43 80 104 126 128 150 
163 166 188-189 197 210 215 220 
271 277 280-281 310 317 336-338 
345 351 372 380-381 383 387 412
415-416 430 445 448 454 456 467 
481 490 499 503 526 528 546 548 
567 575-576 588 601 613 615 647 
660 665 734-735 737 759 778 787 
790 800 832 845 856 859 869 878 
883 887 905 914 932 934 958 976 
985 990 992 999-1000 1025 1031 
1038 1050 1055 1068 1074 1088 
1099-1102 1107 1136-1138 1149 
1156 1163 1172 1190 1195 1200 
1214-1215 1217 1226-1227 1235 
1238-1239 1244 1253 1278 1230 
1293 1311 1320 1330 1334-1335 
1345 1355 1367 1386-1387 1394 
1403 1406 1414 1423 1437 1442 
1465 1521 1529 1536 1539 1541 
1547-1548 1582 1620 1626 1631 
1638 1647 1653 1660 1667 1669- 
1670 1680-1681 1696 1704 1715 
1724-1725 1731-1732 1750 1760- 
1761 



5-8 10 12 14-18 20-21 24-*t> 
33-39 42-43 52 55-58 60-64 68-69 
71 73-74 79-80 82 89 98 100 103 
106 108 112 123 128 133-137 144- 
146 148 150-152 154 158-159 165- 
166 170-172 174 176 178 181-185 
188-190 194-198 201-206 210 217- 
222 224 227-228 231 233-237 247 
251 253-254 256 261-263 266-267 
271 276-277 279-281 284-286 288 
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Tissue Origin j RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



290 297 
320-321 
334 339 
359-360 
303 380 
408 412
441-444 
476 479 
4S5 498 
519-520 522 
547 549 554



299 301 304
323-325 327 
341 344-345
362-363 368
390 393-395 
414-415 423 
448 451-455 
482 485-486 
503 506 509 
527 
557 



induced neuron j Strategene 
cells 



NTD001 



529 
562 

607 

644 
660 
682 
713 717 
738 743 
766 770 
806-807 
837 842 
869-870 872 
904 906-907 



589-591 597 602
629 632 634-640 
652 655 657-658 
672 674-676 679 
706-707 710 
732-734 73$ 
755 759 761 
789 794 803
822 827-829 
•864 866 
893-900 

921-923 926 935-937 
953-954 957 960-961 
970 977-978 984-989 
1000-1001 1005-1006
1014 1016-1017 1023 
1032-1033 1036 1039 
1055 1057-1058 1063 
1077-1078 1085 1087 



1095-1102 
1121-1123 
1139-1142 
1153 1159 
1183-1185 
1207-1208 
1223 1225 



1107-1108 
1131-1133 
1144-1145 
1167 1170 
1190-1192 
1212 1216 



1231 1234 
1247 1253-1254 1258- 
1262 1270-1280 1283 
1298 1307 1314 1316- 
1325 1330 1334-1335 
1354-1355 
1379 1381 
1414 1419 
1428-1429 
1448-1449 
1466 1471 
1489-1491 1493 
1519 1526-1528 



1349-1352 
1370 1377 
1389 1405 
1425-1426 
1437 1439 
1460-1464 
1487 
1512 



1536 1539 1542 1547 
1554 1561-1562 1564 
1576-1579 1581-1592 



1592 1594 
1607-1608 
1621-1622 
1636 1641 
1652 
1662 
1674 
1692 
1720 
1740 
1751 
1771 
1784 



1596-1597 
1610 1612 
1625-1626 
1643-1644 
1654-1655 1657 
1664-1666 
1676-1677 
1701 1706
1723-1728 
1742-1744 
1753 
1774 
1786 



1669 
1680 
1713 
1730 
1746 



309-312 318 
329 331-332 
348 350 356 
371 376 379- 
397-398 405 
430 434-437 
462-464 474 
488 490 494- 
-512 516-517 
534 537-541 
572-574 587 
618 623 628- 
647-648 650- 
665 667 669- 
688 695-696 
720 722-730 
747-748 750 
780 784 786-
809 814 817- 
854-858 863- 
878 881 889 
911 916 919 
946 948-949 
963 965-966 
993-997 
1008 1013- 
1025 1027 
1043 1045 
1068-1075 
1089-1091 
1112-1119 
1136-1137 
1148-1149 
1172-1173 
1196-1199 
1218 1222- 
1240-1241 
1259 1261- 
1285-1286 
•1320 1323- 
1342-1345 
1359 1369- 
1383-1384 
1421-1423 
1431 1434- 
1454 1457 
1480-1483 
1505 1507 
1532 1534 
1549-1550 
1567 1572 
1587-1588 
1601-1602 
-1616 1618 
1631 1635- 
1647 1650 
-1658 1660 
1671 1673- 
1685 1689- 
1715 1719-
•1732 1738 
-1747 1749 



1760-1762 
1776-1777 



1765-1768
1779 1783- 



29 35-36 80 116 123 
214 230 280-281 284 
330 340 358 371 375 



422 424 492 497 532-533 



156 163 181 
285 307 321 
377 380 382 
542 546 
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Tissue Origin 



retinoic acid

induced 
neuronal cells 



neuronal cells 



RNA Source



Hyseq 
Library Name 



SEQ ID NOS: 



549 566 586 595 612
734 775-778 780 792 
856 858 875 936 953 
1041-1043 1055 1072 
1194 1206 1223 1246 
1288-1289 1291 1294 
1349 1359 1412 1423 
1623 1645 1684 1705 



645-647 654 
799 821 826 
985 990 992 
1104 1193- 
1253 1274 
1311 1320 
1485 1620 
1715 1751 



Strategene 



NTR001 



5-8 78 268-269 277 383 431 506
623 677 731 999-1000 1199 1425- 
1426 1547 



Strategene 



NTU001



29 65-66 80 82 110 119 146 152 
166 174 181-185 198 227-228 253 
309 325 332 334 336-338 375 
393 406 414-416 454 
488 503 506 510-512 519 537- 
572-574 597 602 607 623 647 
700 702 716 743 771 792 858 
948 954 977 1000 1005-1006 
1064 1068 1122 1148 1185 



284 
391 
470 
540 
661 
904 
1025
1219 1226 
1295-1296 
1330 1350 
1383-1384 
1539 1547 
1690 1738 



1234 1246 1271 1283 

1311 1317-1320 1329- 

1355 1365-1366 1378 

1400 1412 1445 1505 

1578 1647 1656 1683 

1749 1783-1784 



pituitary
gland 



Clontech 



PIT004 



311 314 379 408 419 
1095-1096 1272-1273 
1378 1652 1671 1720 
1741 1755 



430 454 1055 
1312 1320 
1725 1736 



placenta 



Clontech 



PLA003 



prostate 



5-8 124 208 277 370 
1280 1317-1319 1359 
1737 



843 906-907 
1609 1621 



Clontech 



PRT001 



9 46 57 
201 229 
307 310 
400 430 
489 497 
531-533 
664 710 
871 874 



71 107 147 171 177 197 
231 242-243 274 280-281 
317 330 358 373 382-383 
434-436 461-462 469 477 
500 505-506 513 521 526 
547 618 649 657-658 662- 
729 767 771 789 820 861 
890-891 905 938 945 963- 
964 988-989 1002 1025 1033 1045
1061 1095-1096 1112 1125 1142 



1196 1198 1202 
1258 1272-1273

1333 1341 1344 

1363 1367 1437 

1478-1479 1482 

1527 1531 1536 
1636 1657 
1717 1738 



1232-1233 
1287 1295 
1349 1360 
1442 1447 
1489 1513 
1598-1599 



1241 

1313 

1362- 

1475 

1517 

1628 



1680-1681 
1743-1744 



1687-1688 



rectum 



Invitrogen 



REC001 



17-18 29 33 
113 126 146 
200 206 261 
373 388 395 
442 446 448 
540 547 567 
629 632 
717-719 721 
756 762-763 
825 843 849 
949 960 986 



62-63 71 73-74 83 86 
153 158 167-169 195 
309 312 341 344 368 
408 414 420 430 441- 
464 468 483 517 537- 
585 589 602 623 628- 
645-647 651 657-658 669 

738 748 750 
774 790 819 
903 909 948- 



725-726 
766 770 
851 881 
996 1020 



1034 1064 1067 
1108-1109 1113 
1139 1172 1178 
1205 1220 1225
1317-1320 1323 
1351 1355 1369 



1070 
1130 
1185 
1240 
1334-1335
1373 1375 



1023 1033- 
1075 1086 
1139 1153 
1187-1189 
1244 1271 



1350- 
1425- 
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Tissue Origin RNA Source 



Hyseq 
Library Name 



salivary gland Clontech



SAL001 



SEQ ID NOS: 



1426 1436
1482 1546 
1610 1622 
1665-1666 
1786 



1439 1469 1474 1477 
1587-1588 1592 1596 
1627 1644 1658 1662 
1669 1675-1677 1749 



97 103 110 140 149 152 158 



salivary gland Clontech



SALS 03 



10 55 

198 217-218 242-243 256
312 321 333 351 354 360 
448 473 487 494 496 501
569-570 572-573 590-591
651 759 762 764 768 771
809 826 848 865
933 963 1016 1020 1025 1040 



301 
410 
535
624 
788 
879 906-907 



308 
437 
555
636 
800 
925 
1046 



1055 
1234 
1315 
1359 
1474 



1066 1103 
1281-1282 
1320 1333 



1373 
1482 



1523-1524 
1627 1636 
1671-1672 



1379
1492 
1537



1652-1655 
1691-1692 



1150 1172 
1288-1289 
1336-1337 
1424 1447 
1494 1498 
1554 1596 
1658 



1181 

1298 

1346 

1449 

1511 

1626- 

1665 



158 326 1423 1463-1464 



skin 
fibroblast



skin
fibroblast 



skin 
fibroblast 



small 
intestine 



skeletal
muscle 




skeletal 

muscle 
skeletal 



muscle 
skeletal
muscle 
spinal cord 



Clontech 



Clontech 
Clontech 
"Clontech 



SKM002 



SKMs03 
SKMS04 
SPC001 



1638-1639 
1671 1675
1711 1717 
1729 1733-1734 
1767 1780 1785
18 20-21 82 
151 153 166 
289 329 361 412 414 
459 470 488 503-504 
660 673-675 715 773 
905 922 950 963 982 
1047 1063 1115-1117 
1228 1268 1284 1298
1336-1337 1343 1409 
1509 1599 1624 1644
168 1683 1712 



235-236 1409 



235-236 



4 9 11 17 30-31 35-36 43 46 60 
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Tissue Origin 



RNA Source 



Hyseq 
Library Name



SEQ ID NOS 



82 85 92 94 108
167 198 204-205 



110 
210 
300 
392 
473 
526 
567 
625 



259 277 280-281 
317 372 379 387 
430 433 448 467 
509 513 519 524 
547 549 551 559
607 616-617 623
652 657-658 670-671 
682 709 711 715 719 
749-750 753 775-777 
809 820 832 834-836 
855 858 861 864 871 
898 906-908 917 919 
944 970 985 990 992 
1039 1053 1059 1065 
1077 1082 1085 1097 
1116-1117 1128 1134 
1174 1192-1194 1215 
1243 1283 1294 1307 
1323 1327 1330 1350 
1356 1359 1368 1375 
1407 1423 1429 1437 
1454 1470 1482 1492 
1511 1529 1538 1548- 
1571 1578 1598 1600 
1627 1630 1639 1646 
1670 1686 1696 1740 
1771 



116 139 157 
215 229 256 

-302 304 315 
419 426-427 
487 489 506 
537-540 543 
569-570 5S3 
637 649-650 
673 679 681- 
728-729 734 
781 789 791 
847-849 854- 

-872 875 884 
924 934 942 

■993 998 1013 
1072 



1103 
1151 
1225 
1312 



1075 
1109 
1170 
1241 

1320 



1353-1354 
1400 1406- 
1443 1448
1501 1508 
1549 1565 
1614 1625 
1651-1652 
1751 1755 



adult spleen 



Clontech



SPLc01



117 312 326 348 424 426-427 431 
845 866 1320 1330 1333 1344 
1355-1357 1371 1387 1397 1446
1538 1579 1669 1686 1739 1767 



stomach



Clontech



STO001 



10 15-16 61 68-69 100 117 149 



thalamus 



Clontech 



197 201 
281 287 
426-427 
597 620 
780 782 
967 976 
1071 1135 
1259 1277 
1359 1369 
1487 1493 
1634 1651 



227-228 231 249 273 280-
291-292 302 312 358 362 
430 446 462 475 479 535 
630 651 662-664 722 739 
785 846 919 960 964 966- 
1008 1012 1032 1042 1063 
1170 1208 1234-1235 
1280-1281 1322 1349 
1449 1468 1474 1478 
1498 1557-1559 1622 
1653 1729 



THA002 



9 11 25 
190 198 
239 261 
333-334 
388 393 
477 483 
608-609 
774 782 
899-900 
1034 1055 
1150-1151
1193-1194 
1305 1345 
1440-1441 
1562 1572 
1614 1640 
1688 1703 
1753 



85 87 112 137 146 180 
206 210 212-213 235-236 
268-269 279 290 301 325 



341 
396
508 
647 
784 
961 



351 356 
419-420 
525 531 
681 715 
794 827 



364-365 379 
441-442 458 
549 567 606 
725-727 736 
883 890-891 
997 999-1001 1004 
1097 1129 1144-1145 
1157 1172-1173 1177 
1208 1220 1249 1280 
1434-1435 
1546 1549 
1594 1613- 
1671 1687- 
1746-1747 



1355 1369 
1454 1496 
1578 1590 
1651-1652 
1743-1744 



thymus 



Clontech 



THM001 



44-45 54 57-58 62-64 79 104 123 



126 134 
243 258 
327 330 
430 445 
493 503 



153 193 
274 277 
333 342 
465-466 
506 509 



212-213 218 242 
279 297 301 307 
351 358 371 410 
468 471 483 487 
517 526 535 537 
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Tissue Origin 



thymus 



RNA Source 



Clontech



Hyseq 
Library Name 



THMc02



SEQ ID NOS



540 545
591 604 
649 656 
728 735 
775-777 
624 826 



548 554 567 584 586 590-
612 621 638-640 645-647
660 655 670 698 710 720 
739 746 759 762 766-767 
780 784-785 800 802 809 
828 845 851 858-859 864 
866 870-871 878 884 887 892 899- 
900 927 930-931 967 983 986 990 
992 999 1014 1029-1030 1033 1059 
1066 1073 1103 1107 1113 1116- 
1119 1140-1142 1158 1163 
1177 1195 1206 1209 1213 
1218-1219 1221-1222 1227 
1277 1282 1320 1329 1349 
1369 1383-1384 1417 1419 
1425-1427 1448 1477 1488 
1536 1554 1620 1644 1646 
1654-1655 1661-1662 1669- 



1117 
1172 
1216 
1271 
1367 
1423 
1493 
1549 
1670 
1707 



1674 
1711 



1676-1677 
1731-1732 



1685-1688 
1737 



5-9 15-21 25 33 35-36 43-45 48 
50-51 54-55 60 75 83 87 89 93
98-100 102 105 112 117 135-137 
141 143 146 157 167 169 192 196 



211 217-219 222 
236 240-241 244 
262 268-269 286 
301-302 309-310 
327 334 342 350 
373 382 384 400 
424 430-431 436 
464-467 470 472 



224 229 
251-252 
288 290 
315-317 
352-353 
403 410 
445 454-456 
474-476 483 



233 235- 
256 261- 
295 297 
321 324 
360 370- 
414-416 
461 
488 



513 516 519-520 
534 537-540 549 
569-570 572-573 
595 603-604 606 
636 647 650 
673-675 678 
725-726 731 



800 810 
854-856 
890-891 
941 949
981 986 



497 500 504 506 
524 526 530-531 
554-555 565-566 
575-577 586-587 
612 630-632 634 
660 666-667 669 
700 703 708 720 
739 743-744 750-753 757 759 
765 767 772-779 787 789-790 
823 829 834-836 841 
859 861 864 870-871 
898 908-909 913 928 
958 961 963 967 969
988-990 992 999 1007- 
1016 1039 1041 1073- 
1089 1097 1109 
1131 1140-1141 
1172 1175-1177 
1206 1211 1216 
1234-1243 
1280-1281 
1317-1320 1322 
1330 1334-1335 
1350-1351 1355 1357 
1374 1377-1379 1386 
1397 1400 1402 
1425-1427 
1477 1483 
1525 1536 
1598-1600 
1623 1625 



657- 
698 
738- 
763- 
798 
848 
881 
933 
975 



1008 
1074 
1117 
1145 
1196 
1223 
1267 
1308 
1327 



1014 
1079 
1122 
1163 
1198 
1227 
1271 



1392 
1417 
1466 
1504 
1566
1614 
1641 
1658 
1681 
1711 
1733 
1761 



1423 
1474 
1506 
1594
1621 
1644 



1647 



1662-1663 
1686-1688 
1717-1718 
1737-1738 
1771-1772 



1649 
1671 
1693 



1726-1727 
1743-1745 
1779 1786 



1114- 
1144- 
1186 
1220 
1261-1262 
1284 1290 
1324-1325 
1339 1346 
1360 1370 
1389-1390 
1406-1407 
1440-1441 
1493 1498 
1545 1549 
1608 1611 
1632 1639 
1653-1656 
1673 1678- 
1705 1707 
1731- 
1758- 
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Tissue Origin 


RNA Source 


Hyseq 






SEQ 


ID NOS : 








Library Name 














thyroid gland 


Clontech 


THR001 


4 9- 


10 20 


-21 3 


7-39 


48 50-51 54- 








57 60-61 


65-66


71 83 94- 


96 98- 








100 


102 104 IK 


0 112 


115- 


117 119 








123 


127 133 13) 


5-137 


140 


149 152- 








153 


155-158 163-164 


168-


169 171 








186 


190-192 19* 


7 201 


-203 


219-220 








229


233-237 246-247 


253 


256 258 








262 


265-266 268-269 


277 


280-281 








284- 


286 288-289 298 


-299 


302 309- 








311 


317 321 326 332 


335 


341-342 








344 


348 350 354 358 


-359 


363 368 








371- 


373 382-383 385 


394 


398 400- 








401


411 414-415 421 


424 


430-431 








433- 


436 443-446 450 


-452 


454-455 








458 


472-474 476-478 


482 


484-485 








487- 


488 490-494 496 


-497 


500-501 








503- 


504 506 509-513 


516- 


517 519 








524 


526-527 529 535 


-540 


547 549 








562 


564 569-570 575 


-576 


588 594- 








595 


601-602 604 606 


610 


612 615- 








617 


619-623 628-630 


634- 


635 642 








647 


649-651 660 662-665


668 670 








681 


690-694 696 698 


I UU 


709 721 








727- 


729 732 734 738 


/H U — 


741 743 








745 


750 759 761 763 




770 773 




• 




780 


785 795-796 798 




804 823- 








824 


826 828 833 838 




845 847 








849


857-860 867 874- 




878 880-








881 


887-888 890-892 




895 898 








908 


910-911 913-914 


Z7 £■ 4 


923 926- 








927 


929 932-934 937 


-7 _> 


941-942 








948


953 957 961 963- 




966 978- 








979 


981-982 987 990


992 


1001 








1004 


-1006 


1010 


1014 


1020


1024 








1033 


1038- 


-1039 


1044 


1047 


1050 








1052 


-1054 


1055 


1058 


1068 


1070- 


• 






1071 


1077- 


-1079 


1088 


1094 


-1097 








1105 


-1106 


1112- 


-1113 


1116 


-1117 








1124 


1126 


1128- 


•1129 


1131 


1134 








1136 


-1137 


1142- 


•1143 


1146 


-1147 








1149 


-1150 


1156 


1161- 


-1164 


1167 








1170 


-1173 


1177- 


•1181 


1190 


1192 








1197 


1200 


1204 


1208- 


•1209 


1214 








1217 


1219 


1222 


1230 


1232 


-1233 








1235 


1241 


1245 


1247 


1254 


1257- 








1258 


1260 


1262 


1271- 


-1273 


1283 








1286 


-1289 


1299 


1306 


1314 


1320 








1330 


-1332 


1334- 


1335 


1342 


1345 








1349 


1365- 


1367 


1370- 


1372 


1374 








1381 


1394 


1407 


1419 


1428 


1436- 








1437 


1440- 


1441 


1443 


1446 


-1449 


- 






1454 


1459 


1461- 


1462 


1468 


1470- 








1471 


1475 


1477 


1479 


1482 


1491 








1497 


-1498 


1504-1505 


1507 


1513 








1522 


1524- 


1526 


1528 


1531 


1534








1536 


-1537 


1548 


1550 


1553 


1555- 








1559 


1562 


1567 


1578 


1590 


-1591 








1597 


1599- 


1601 


1612 


1614 


1616 








1619 


-1620 


1622 


1624- 


1626 


1628 








1631 


-1632 


1634 


1636 


1639 


1644- 








1645 


1648 


1651 


1653- 


1656 


1658 








1660 


1662- 


1663 


1667 


1669 


1671 








1675 


1678- 


1681 


1683- 


1686 


1689 








1691 


-1692 


1703 


1709- 


1711 


1717 








1724 


-1726 


1729 


1734 


1737 


-1738 








1740 


1743- 


1744 


1749 


1753 


1759- 








1761 


1770 


1777 


1786 






trachea


Clontech 


TRC001 


9 29 


-31 46 


48 8 


7 104 


107 


110 135 








158 222 2fi 


2 266 


286 


301 318 331 
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Tissue Origin 
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Library Name 








SEQ 


ID NOS: 










352 


372 


377 


384 


414 


424 


445-446 








454 


472 


474


491 


496 


560 


579 588








593


597 


607 


612 


626 


681 


702 719 








810 


859 


866 


878 


894- 


895 


912 916 








922


932 


935 


104 


6 1075 1080 1099- 








1102 


1113 


1208 


1215 


1232 


-1233 








1237 


1261 


1312 


1385 


1387 


1405 








1414 


1424 


1430 


1437 


1447 


1505 








1569


1579 


1586 


1600 


1641 


1653 








1567 


1671 


1676- 


1677 


1683 


1691- 








1692 


1711 


1717 


1726 


1772 




uterus 


Clontech 


UTR001


17 19 25 41 


46 


57-58 


61


89 104 








108 


139 


152 


174 


198 


200- 


201 206 








263- 


265 


274 


290 


387 


408 


420 438 








446


448 


452 


473


491 


493 


499 503








506 


513 


519 


522 


526 


530 


542-543








560 


601 


610 


632 


659 


665 


720 751 








773 


780 


833 


845 


857 


872 


877 912 








929 


934 


937 


996 


1009 1011 1018 








1050 


1075 


1107 


1124 


1170 


1219 








1258 


1279 


1287 


1310 


1320 


1323 








1343 


-1344 


1375 


1437 


1451 


-1452 








1478 


1481 


1498 


1519 


1521 


1536 








1552 


1579 


1597 


1602 


1606 


1620 








1626 


-1627 


1649 


1652 


1661 


1670 








1719 


1722- 


-1723 
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TABLE 2 



PCT/US00/34263



SEQ 
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 

- 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


%
IDENTITY 


1 


Y41736 


Homo 
sapiens 


Human PROH14 protein 
sequence . 


1398 


100 


2 


Y66656 


Homo 
sapiens 


Membrane-bound protein
PRO943.


2389 


99 


3 


AF113136 


Homo sapiens 


IL-1 receptor-associated
kinase-M; IRAK-M


3043 


100 


4 


AF017806 


Mus musculus


Zn-15 transcription factor 


6351 


77 


5


X02761 


Homo sapiens 


fibronectin precursor 


10535


98 


6 


X02761 


Homo sapiens 


fibronectin precursor 


8990 


89 


8


X02761 


Homo sapiens 


fibronectin precursor 


12564 


99 


9 


AJ011679 


Homo sapiens 


Rab6 GTPase activating 
protein, GAPCenA 


5251 


99 


10 


W88501 


Homo sapiens 


Human stomach carcinoma clone
HP10415-encoded protein.


2381 


100 


11 


AF117754 


Homo sapiens 


thyroid hormone receptor- 
associated protein complex 
component TRAP240 


11336 


98 


12 


Z97630 


Homo sapiens 


dJ4 66£J1.4 (novel protein 
similar to ANK3 (ankyrin 3, 
node of Ranvier (ankyrin 
G) ) ) 


896 


100 


13 


Y58620 


Homo sapiens


Protein regulating gene 
expression PRGE-13 . 


1894 


98 


14 


AF213457 


Homo 
sapiens 


triggering receptor expressed 
on myeloid cells 2 


1238 


100 


16


AF233453 


Homo sapiens 


RACK-like protein PRKCBP1


3124 


99 


17 


AF201303


Homo sapiens


dhfr oribeta-binding protein
RIP60 


3130 


98 


18 


AF064205 


Homo sapiens


dynactin 1 p150 isoform


6377 


100 


19 


U00059 


Saccharomyce 
s cerevisiae 


Ynrl21wp 


174 


26 


20 


AB032903 


Homo sapiens 


guanosine monophosphate
reductase isolog


1801 


99 


21 


AB032903 


Homo sapiens


guanosine monophosphate
reductase isolog 


1485 


99 


22 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent
protein kinase kinase beta 


3083 


99 


23 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent
protein kinase kinase beta 


2300 


99 


24 


AJ289131 


Homo sapiens 


chondroitin 4-O-
sulfotransferase


2211 


99 


25 


U33460 


Homo 
sapiens 


DNA-directed RNA polymerase
I, largest subunit 


8777 


98 


26 


Y44488 


Homo sapiens 


ACRP30R2 variant protein. 


1387 


100 


27 


U43701 


Homo sapiens 


ribosomal protein L23a 


791 


100 


28 


U02032 


Homo sapiens 


ribosomal protein L23a 


767 


97 


29 


Y41324 


Homo sapiens 


Human secreted protein 
encoded by gene 17 clone 
HNFIY77. 


1083 


99 


30 


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2 . 


715 


90 


31 


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2 . 


631 


82 


32 


AF231917 


Homo sapiens 


long-chain 2-hydroxy acid
oxidase HAOX2 


1811 


100 


33 


Z29481 


Homo sapiens 


3-hydroxyanthranilic acid
dioxygenase


1507 


99 


34 


AB001451 


Homo sapiens 


Sck 


2869 


100 


35 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


1667 


99 


36 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34
to 287) 


1104 


98 


37 


Y78795 


Homo sapiens 


Human antizuai-2 (AZ-2) amino 
acid sequence . 


3586 


78 


38 


Y78795 


Homo sapiens 


Human antizuai-2 (AZ-2) amino 
acid sequence . 


4726 


99


39 


Y7B79b 


Homo sapiens


Httrnan anti2tiai-2 ( A2 - 2 ) amino 

acid sequence . 


3556 


77 


40 


U93121 


U cairi^ &rsc 

nomo oapiciio 




3747 


100 


41 


Y42750


Homo sapiens 


Human calcium binding protein 


795 


100 


42 








1189 


100 


43 


G02150 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6231.


384 


94 


44


UliJbi / 


rlua mui»i»UAii* 


El£-1 


2724 


88 


45


uxy oi / 


1 Y IU9 IHUoLUlUJ 




2062 


86 


46 


Ar 1UU /do 




osteoinductive factor OIF


1538 


100 


47 


YB7591 


Homo sapiens 


Human SPROUTY-1 protein, SEQ 
ID NO:24.


1737 


99 


49 


X04145 


Homo sapiens


160) 


942 


99 


51 


X63547 


Homo sapiens 


oncogene 


5845 


99 


52 


M94043 


Rattus
norvegicus


rab-related GTP-binding
protein 


1089 




53 


L317B3 


Mus musculus 


uridine kinase 


917 


71 


54 


X83973 


Homo sapiens 


transcription factor 


moo 




55 


AF224741 


Homo sapiens 


chloride channel protein 7 


4128 


99 


56 


W74805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
HOEAS24 . 


14 y i 




57 


Z50907 


Homo sapiens 


Human TBC-1 cDNA from second 
transcript . 


4824 



100 


58 


D79994 


Homo sapiens 


similar to ankyrin of 
Chromatium vinosum.


6089 


99 


59 


D79994 


Homo sapiens 


similar to ankyrin of
Chromatium vinosum.


a n i a 


" -1- 


60 


Y59738 


Homo sapiens 


Human normal ovarian tissue 
derived protein 15. 


bul 


i no 


61 


AB031069 


Homo sapiens 


protein containing CXXC
domain 1 


1 "3 OA 
13 ?U 




62 


Y66660 


Homo
sapiens


Membrane-bound protein
PRO783.


2492 


99 


63 


Y66660 


Homo 
sapiens 


Membrane-bound protein
PRO783.


1709 


99 


64 


S70011 


Rattus sp. 


tricarboxylate carrier 


895 


55 


65 


AF139518 


Rattus 
norvegicus 


A-kinase anchor protein 


178 


24 


66 


W29666 


Homo sapiens 


Homo sapiens DHl30o_i clone 
secreted protein.


1 C7 
X i> / 


30 


67 


AJ245738


Homo sapiens 


claudin-15 


1206 


100 


68 


AF099138


Rattus 
norvegicus 


GLUT4 vesicle protein


4183


87 


69 


AF099138 


Rattus 
norvegicus 


GLUT4 vesicle protein


4906 


86 


70 


Z82059 


Caenorhabdit
is elegans


canal protein comes from 
this gene


1285 


44 


71


TV 'J A O *7 O 


-T. UllIU O Cl J. C_ J J CJ 


pMPPA'i r> *-o tr* i n 


1282 


100 


72 


AF126426 


Homo sapiens 


neurotrimin 


1809 


100 


73 


Y41652 


Homo 
sapiens


Human MEK2 protein sequence. 


2065 


99 


74 


Y41652 


Homo
sapiens 


Human MEK2 protein sequence.


1207 


100 


75 


AF188622 


Mus musculus 


selectively expressed in 
embryonic epithelia protein-1


1485 


74 


76 


AE000406 


Escherichia 
coli 


putative DNA topoisomerase 


950


100 


77 


X99302 


Homo sapiens 


Popl 


655 


100 


78 


AL136538 


Schizosaccha 

romyces 

pombe 


similarity to S. cerevisiae 
ktil2 protein 


210 


31 


79 


AF129756


Homo sapiens 


G4 


1554 


99


80 



AL096768 


Homo sapiens 



dJ858B16.2 
(phosphatidylserine 
decarboxylase (PSSC, EC 
4.1.1.65))


2033 


100 


81 


AL096768


Homo sapiens 


dJ858B16.2
(phosphatidylserine
decarboxylase (PSSC, EC
4.1.1.65))


1220 


96 


82 


X57351


Homo sapiens 


1-8D 


677 


98 


83 


AC005594 


Homo sapiens 


R26984_1


2700 


98 


84 


X73113 


Homo sapiens 


fast MyBP-C


5959


59 


85 


AF097330 


Homo sapiens 


H1 chloride channel; p64H1;
CLIC4 


1305 


99 


86 


AB018423 


Mus musculus 


SH2 domain-containing protein


1360 


78 


87 


AF272151 


Homo sapiens 


adaptor protein CIKS 


3084 


99 


88 


AF196329 


Homo 
sapiens 


triggering receptor expressed 
on monocytes 1 


1214 


100 


89 


AB016879 


Arabidopsis 
thaliana


contains similarity to pre-
mRNA splicing
factor; gene_id:MRB17.2


634 


36 


90 


AJ133721 


Mus musculus 


homeodomain protein 


654 


57 


91 


AJ242864 


Mus musculus 


phtf protein 


619 


61 


92 


A61971 


unidentified 


MCSP 


11676 


99 


93 


Y99365 


Homo sapiens 


Human PRO1250 (UNQ633) amino 
acid sequence SEQ ID NO: 86.


3890 


100 


94 


Y87231 


Homo sapiens 


Human signal peptide 
containing protein HSPP-8 
SEQ ID NO: 8. 


1031 


100 


95 


AF227741 


Rattus 
norvegicus 


protein kinase WNK1


2428 


95 


96 


AF227741


Rattus 
norvegicus 


protein kinase WNK1 


1961 


94 


97 


Y92513 


Homo sapiens 


Human OXRE-10. 


1626 


100 


98 


AL021366 


Homo sapiens 


CICK0721Q.3 (Kinesin related 
protein) 


3423 


100 


99 


AC005733


Homo sapiens 


R33083_1


1974 


99 


100 


Y95293 


Homo sapiens 


Human GEF containing NEK-like
kinase substrate sGNK. 


4092 


99 


101 


AL118501 


Homo sapiens 


dJH91N16.1 (A novel protein 
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


1509 


100 


102 


AJ006267 


Homo sapiens 


ClpX-like protein 


3233 


100 


103 


AF100753 


Homo sapiens 


ancient ubiquitous 46 kDa 
protein AUP1 


2042 


96 


104 


AB015982 


Homo sapiens 


serine/threonine kinase


4718 


100 


105 


AF151074 


Homo sapiens 


HSPC240


831 


64 


106 


M35522 


Canis 
familiaris


GTP-binding protein (rab7)


354 


50 


107


R99800 


Homo sapiens


NTII-1 nerve protein, 
facilitates regeneration of 
nerve cells. 


2337 


93 


108 


AF125533 


Homo sapiens 


NADH-cytochrome b5 reductase
isoform


1290 


93 


109 


AC005614 


Homo sapiens 


F23269_2 


3369 


99 


110 


AF064729 


Homo sapiens 


RAN binding protein 16 


3285 


100 


111 


X52425


Homo sapiens 


interleukin 4 receptor 


4496 


100 


112 


Y41686 


Homo 
sapiens 


Human PRO274 protein
sequence.


2285 


100 


113 


W15506 


Homo sapiens 


Mitogen activating protein 
kinase ERK1.


1991 


100 


114


Y71071 


Homo sapiens 


Human membrane transport 
protein, MTRP-16. 


1190 


99 


115 


AL049548 


Homo sapiens 


dJ398G3.1 (ortholog of rat 
CPG2) 


3497 


99 


116 


AF189817 


Mus musculus 


evectin-2 


1124 


90 


117 


W30891 


Homo
sapiens


Human cytostatin III protein.


715


99








118 


AF116618 


Homo sapiens 


PRO1038 


1469 


100 


119 


Y08915 


Homo sapiens 


alpha 4 protein 


1748 


100 


120


AF098070


Drosophila
melanogaster


Lis1 homolog


192 


39 




AF052^32 


Homo sapiens 


katanin p80 subunit


181 


37 


122 


Y70743 


Homo sapiens 


PSEQ-1 protein encoded by 
NSEQ gene associated with 
matrix remodelling. 


2637 


98 


123


AF083246 


Homo sapiens


HSPC028


2132 


100 


124 


Y27096 


Homo sapiens 


Human viral receptor protein
(ACVRP) . 


833 


99 


125 


M63109 


Leishmania
major


glycoprotein 96-92 


172 


27 


126 


U75467 


Drosophila
melanogaster 


Atu 


935 


36 


127 


Z68220 


Caenorhabdit
is elegans 


Similarity to Human ADP/ATP 
carrier protein 


438


43


128


AF095927 


Rattus 
norvegicus 


protein phosphatase 2C 


1927 


94 


129


W929S8 



Homo sapiens 


Human zsig44 protein. 


463 


100 


130 


AF115391 


Lactobacillu
s sakei 


ribokinase RbsK


508 


37 



131 




Homo sapiens 


21-Glutamic Acid-Rich Protein


1250 


100 


132 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein 


916 


87 


133 


W52B11 


Homo sapiens 


Human DBI/ACBP-like protein
(DBIH) . 


705 


97 


134 


Y84444 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated
protein. 


3230 


100 


135


M69181


Homo sapiens 


non-muscle myosin B


189 


20 


136 




Homo sapiens 


Human secreted protein 
encoded by gene 154 clone 
HE6FL83 . 


480 


100 


137




Homo sapiens 


Human secreted protein 
encoded oy gene /=> cione 
HHGAU81. _j 


855 




138


AL033520 


Homo sapiens 


KIAA0701 protein) 


424 


39 


139 


AF020261 


oanvaXuui 

album 




119 


30 


140 



X70394 


Homo sapiens




1634 


100 


141 


Y06439 


Homo sapiens 


Human protease HUPM-8. 


936 


100 


142 


Z68493


Caenorhabdit
is elegans 


predicted using Genefinder


365 


42 


143 


AB018107 


Arabidopsis 
thaliana 


ADP-ribosylation factor-like
protein 


596 


65 


144 


AF161483 


Homo sapiens 


HSPC134 


580 


51 


145 


YB4902 


Homo sapiens 


A human proliferation and
apoptosis related protein. 


480 


100 


146


AB004906 


Ipomoea 
purpurea 


transposase 


146 


20 


147 


AC007357 


Arabidopsis 
thaliana 


P3F19.18 


647 


31 


148


W75155 


Homo sapiens 


Human secreted protein 
encoded by gene 41 clone 
HNTME13. 


1494 


98 


149


AF056490


Homo sapiens 


cAMP-specific
phosphodiesterase 8A 


3710 


99 


150 


YS8171 


Homo 
sapiens 


Human hydrolase homologue 
HHH-7 . 


785 


99 


151 


U10397


Saccharomyce 
s cerevisiae 


Yhrl46wp 


515 


53 


152 


X73478 


Homo sapiens 


phosphotyrosyl phosphatase 
activator 


1719 


99 


153 


AL049697 


Homo sapiens 


dJ3 82H0.5.i (novel protein
similar to arginyl-tRNA)


2034


99






154 


AF169802 


Homo sapiens 


cytochrome b5 reductase b5R.2 


1455 


99 


155 


X94703 


Homo sapiens 


rab28 


1126 


99


156 


Y25716 


Homo sapiens 


Human secreted protein 
encoded from gene 6 . 


1471 


100 


158 


W77404 


Homo sapiens 


Secreted salivary polypeptide 
zsig32 . 


937 


100 


159 


Y17248 


Homo sapiens 


Human protein kinase 
inhibitor-2 (PKI-2) . 


383 


100 


160 


J04970 


Homo sapiens 


carboxypeptidase M precursor 


2395 


100 


161 


W54040


Homo sapiens 


Human interferon-inducible
protein, HIFI. 


484 


98 


162 


AL022724 


Homo sapiens 


dJ413H6.1.1 (hamster
Androgen-dependent Expressed
Protein LIKE putative 
protein) (isoform 1) 


1357 


100 


163 


AF125535 


Homo sapiens 


pp21 homolog 


193 


45 


164 


G03632 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7713.


463


97 


165 


AJ250839 


Homo sapiens 


serine/threonine protein
kinase 


1442 


71 


166 


L09649 


Zymomonas 
mobilis 


zm2 


173 


37 


167 


Y73337 


Homo sapiens 


HTRM clone 1944530 protein 
sequence. 


1204 


100 


168 


W86645 


Homo sapiens 


Secreted protein encoded by 
gene 112 clone HUKFC71 . 


1084 


100 


169 


AF214731 


Homo sapiens 


ATP-dependent RNA helicase 


4402 


100 


170 


AE000871 


Methanobacte
rium 

thermoautotr 
ophicum 


conserved protein 


166 


27 


171 


Y27684 


Homo sapiens 


Human secreted protein 

encoded by gene No. 118. 


821 


100 


172 


AF226044 


Homo sapiens 


HSNFRK 


2904 


100 


173 


AJ245946 


Homo sapiens 


neuroglobin 


779 


100 


174 


D43949 


Homo sapiens 


This gene is novel . 


3202 


100 


175 


Y07923 


Homo sapiens


GTP-binding protein 


1205 


100 


176 


W90338 


Homo 
sapiens 


Human DPI homologue protein. 


966 


100 


177 


Y41675 


Homo sapiens 


Human channel-related
molecule HCRM-3 . 


1122 


100 


178 


Y41674


Homo sapiens 


Human channel-related
molecule HCRM-2 . 


936 


99 


179 


AF220492 


Homo sapiens 


krueppel-like zinc finger 
protein HZF2 


4100 


99 


180 


X03084 


Homo sapiens


C1q B-chain precursor


1240 


100 


181 


U57344 


Mus musculus 


Meis3 


1813 


89 


183 


U57344 


Mus musculus


Meis3 


1743 


86 


184 


U57344 


Mus musculus 


Meis3


1070 


86 


185 


AF033120 


Homo sapiens


p53 regulated PA26-T2 nuclear 
protein 


1389 


58 


186 


AF200357 


Mus musculus 


pantothenate kinase 1 beta 


1605 


82 


187 


W75058 


Homo sapiens 


Human secreted protein 
encoded by gene 2 clone 
HLDBG33 . 


1188 


99 


188 


AJ292529 


Homo sapiens 


suppressor of sterile four 1 


2424 


100 


190 


X54134 


Homo sapiens 


protein-tyrosine phosphatase


3705 


100 


191 


Y22203 


Homo sapiens 


Human calcium-binding 
phosphoprotein, CBPP-1,
protein sequence . 


1083 


99 


192 


W63692 


Homo 
sapiens 


Human secreted protein 12- 


1975 


100 


193 


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide . 


2605 


99




AF0R42S9 


Mus musculus


bromodomain-containing
protein BP75


693


54 


195


Y00752 


Rattus 
norvegicus 


serine dehydratase (AA 1-
327)


994 


61 


196


W95349 


Homo sapiens 


Human foetal brain secreted 
protein fhl70_7. 


2596 


100 


197 


AB028859 


Homo sapiens 


hDj9 


1890 


100 


198 


W95633 


Homo sapiens 


Homo sapiens secreted protein 
gene clone hm236_1.


1614 


100 


199 


Y44277 


Homo 
sapiens 


Human nucleic acid methylase- 
2. 


2096 


99 


200 


AB030039 


Homo sapiens 


hPACPLl 


2258 


100 


201 


X54162 


Homo sapiens 


64 Kd autoantigen 


2918 


99 




G02061 



Homo sapiens


Human secreted protein, SEQ 
ID NO: 6142. 


558 


99 


TAT 


Y1 1QQC 


Nicotiana 
tabacum


extensin (AA 1-620) 


185 


33 


n d 

Z. VJ 




Bos taurus


32 kd accessory protein 


1837 


100 


205 


J04204 


Bos taurus 


32 kd accessory protein 


1101 


100 








Human signal peptide
containing protein HSPP-60
SEQ ID NO: 60.


1318 


100 


or* a 


1 U400U 


Homo sapiens


Fragment of human secreted
protein encoded by gene 65 . 


936 


98 


*> T\ Q 


BT.l TIQQO 
-M.XJ-L / iOOJ 




<U1076E17.1 IKXAA0823 protein 
(continues in AL023803))


694 


54 




AX i £D f J & 


Homo sapiens


NPB007 


1345 


76 


211 


X66295 


Mus musculus


C1q C chain


970 


73 


212 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme


966 


100 


213 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme
UbcH2


542 


98 


214 


AJ002030 


Homo sapiens 


progesterone binding protein


1163 


100 




Y?nc/i a 
A / UO*^ 


Homo sapiens


member of DEAD box protein
family 


3933 


100


216


AF250558


Homo sapiens 


claudin-2 


1169 


99 


217 




Homo sapiens


H.Tfi2iDii i (PUTATIVE i> rote in) 


259 


100 


218 


Y08565 


Homo sapiens 


UDP-GalNAc:polypeptide N- 

acetylgalactosaminyltransfera 

se 


3331 


99 


dm JL 2* 


I -/ 4 *i J A. 


Homo sapiens


Human inflammation associated 
protein 


2067 


100 


220 


AL035521 


Arabidopsis
thaliana


putative protein 


315 


42 


221 


AL031786 


Schizosaccha

romyces 

pombe 


putative proline-tRNA
synthetase 


811 


41 


222 



AL109736 


Schizosaccha 
romyces 


pombe 


WD repeat protein 



626 


40 


223 


X52493 


Glycine max 


DNA-directed RNA polymerase 


136 


23 


224 


AL0356S9 


Homo sapiens 


dOT979Nl.l (dJ979Nl.l) 


5199 


98 


225 


AB032401 


Mus musculus 


mmDj4 


1761 


92 


226 


AB032401


Mus musculus 


mmDj4 


1988 


92 


227 


X83502 


Saccharomyce 
s cerevisiae 


J1007 


112 


26 


228 


X83502


Saccharomyce
s cerevisiae 


J1007 


79 


25 


229 


AF143723 


Homo sapiens 


heat shock protein HSP60 


2557 


99 


230 


Y66677 


Homo 
sapiens 


Membrane-bound protein
PRO828.


982 


100 


231 


AB027466 


Homo sapiens 


spondin 2 


1756 


99 



232 


W95634


Homo 
sapiens 


Homo sapiens secreted 
protein. 


1391 


100 


233 


K00365 


Homo sapiens 


Human cyclin B1.


2218 


99 


234 


Y53762 


Homo sapiens 


A GTP-binding polypeptide
designated RAQ.


1017


100






235 


Z50749 


Homo sapiens 


yeast sds22 homolog


1800 


100 


236


Z50749 


Homo sapiens 


yeast sds22 homolog


1754 


98 


237 


AB026491 


Homo sapiens 


PICK1 


2137 


100 


238 



AJ270205 


Entodinium 
caudatum


putative 

phosphatidylinositol-4-
phosphate 5-kinase 


114 


37 


239 


AB030189 


Mus musculus


contains transmembrane (TM) 
region and ATP binding region 


710 


93 


240 


W56538 


Homo sapiens 


Human hedgehog interacting 
protein (HIP) . 


3785 


99 


241 


W56538 


Homo sapiens 


Human hedgehog interacting 
protein (HIP) . 


3436 


99 


242 


AF155107 


Homo sapiens 


NY-REN-37 antigen 


996


99 


243 


AF155107 


Homo sapiens 


NY-REN-37 antigen


1005


100 


244 


AL031320


Homo sapiens 


dJ20N2.1 (novel protein 
similar to yeast and
bacterial cytosine 
deaminase ) 


763


99 


245 


U37026 


Rattus
norvegicus 


sodium channel beta 2 subunit


1 c o 


30 


246


AL078599


Homo sapiens 


dJ991C6.1 (novel protein 
similar to C. elegans
FS5A12.9 (Tr:P91O06)) 


2391 


98 


247 


U32274 


Saccharomyce 
s cerevisiae 


Ydr386wp; CAI: 0 .12 


191 


37 


248 


Y41719 


Homo 
sapiens 


Human PK(J8b4 protein 
sequence.


-1 O T Q 


100 


249 


AB029434 


Homo sapiens 




DJ.1 


100 


250 


X97831 


Rattus 
norvegicus 


carnitine/acylcarnitine

carrier protein 


... ... 


38 


251 


W80993 


Homo 
sapiens 


RIF . 


-L. / ^ ""1 


100 


252 


Y94873 


Homo 
sapiens 


Human protein clone HP02632.


1876 


100 


253 


W59878 


Homo sapiens 


Amino acid sequence of the 
cDNA clone AIF-2 (HEBGM4 9) . 


765 


100 


254 


AL354533 


Leishmania 
major


possible adenylate kinase 


265 


34 


255 


AF233322 


Mus musculus


zinc transporter like 2 


1916 


95 


256


Y78113 


Homo sapiens 


Human cytokine signal 
regulator CKSR-1 SEQ ID 
NO:1.


2247 


99 


257 


AL035539 


Arabidopsis 
thaliana 


putative amino acid transport 
protein 


350 


27 


258 


W74787 


Homo sapiens 


Human secreted protein 
encoded by gene 58 clone 
HHFHN61. 


1171 


100


259 


AL035689 


Homo sapiens 


dJ18 7Jll.l (novel protein 
similar to protein kinase C 
inhibitors) 


974 


100 


260 


AE000909 


Methanobacte
rium 

thermoautotr
ophicum 


serine/threonine protein 
kinase related protein 


363 


30 


261 


AL050131 


Homo sapiens 


hypothetical protein 


626 


100 


262 


AF019661 


Mus musculus 


zeta proteasome chain; PSMA5 


1214 


100 


263 


AL035593 


Homo sapiens 


dJ310J6.1 (novel protein) 


821 


100 


264 


AL022318 


Homo sapiens 


T>K1S0C2.3 (PUTATIVE novel 
protein similar to APOBEC1)


1072 


100 


265 


AF205940


Homo sapiens 


endomucin 


1289 


100 


266 


AL023583 


Homo sapiens 


dJ500L14.1 (novel protein)


789 


100 


267 


AL034548


Homo sapiens 


dJH03G7.3 (novel protein 
kinase domains containing 
protein similar to 
phosphoprotein C8FW) 


1888 


99


268


AF161470 


Homo sapiens 


HSPC121 


1884 


98 


269 


AF161470 


Homo sapiens 


HSPC121 


1232 


96 


270


X90763 


Homo 
sapiens 


HHa5 hair Jr r~?* r S n r vnp T 
intermediate filament 


2190 




271 


AF207600 


Homo sapiens 


ethanolamine kinase


1952 


100 


272 


M32334 


Homo sapiens 


intercellular adhesion 
molecule 2 


1436 


100 


273 


AF161483 


Homo sapiens


HSPC134 


W \J —J 


O -X. 


274 


Y53052 


Homo sapiens 


Human secreted protein clone 
df2 02_3 protein sequence SEQ 
ID NO • 11 O 


587 


100 


276 


Y77S76 


Homo sapiens 


Human cytoskeletal protein 

XII \ \— J. KJL It- Jl 10 ^ • 


762 


100 


277 


AF077042 


Homo sapiens 


30S ribosomal protein S7
homolog


1269 


100 


^ / 0 




nQulO odplcnS> 


Huniau secreceo prouein cione 

caiuo x ja pi ULClu sequence 




Q Q 

y a 


279






human phosphorylation 
effector PHSP-20. 


2 ft m 


>QQ 


oon 
O *-* 






rou v. ^ anbuucin 


X oJL o 




281 


Z7 513 4 

X.J * 


£ amil iairis 




1 71 ft 




282 


AF249873 


Homo sapiens 


muscle-specific protein 


1395 


100






liW^i'W «9 O^^^ did 




*± w 7 


98


284 




nim*i\ \\j & ct l/ jl nj i ^3 


DOT 


1 ft*^*J 


q Q 


285 


AF15610? 




null > cuiupi.C?A Ci/iC aVUJUUAL 


1 "^1 8 


99


286 


Y3S897 


Homo sapiens 


Extended human secreted 
146. 


1250 


99


287 










X U w 


288 


AT. 0^0143 








i hn 


289 


AJ0110S8 


Homo sapiens 


telethonin 


574 


100 


^ ~ V 


ZOO / <i 1 


nOulU 

sapiens 


Membrane-bound protein
PROS36. 






^ J A 


ATTm AAOI 
■rvr ujvoui 


nuiitu ti ct^j icuo 


j- iprj.n~ ci ipna f» 






292 

A* 


-/VT V — J Jt C_* U _L 






^ O v/ 


i nn 

xuu 


293 


AL049851 


Homo sapiens 


dJ889J22B.l (novel protein 

f i co in 1 \ \ 


1738 


100 


294 


Y73348 


Homo sapiens 


HTRM clone 839651 protein 


1245 


99 


295 


L11672 


Homo sapiens 


zinc finger protein 


1694 


44 


296 


AL035423


Homo sapiens


dJ2 0I3 L (brain mitochondrial 
carrier protein-1 (BMCP1))


1024 


79 


297 


AF198532


Homo sapiens 


lymphoid enhancer binding 
factor-1


2173 


100 


298 


AF161417 


Homo sapiens 


HSPC299 


1147 


85 


299 


AF159141 


Homo sapiens 


breast cancer metastasis- 
suppressor-1


1236 


99 


300


U26397


Rattus


inositol polyphosphate 4-
phosphatase 


160 


30 


301 


AF036145


Homo sapiens 


meningioma- expressed antigen 
5 


3458


100 


302 


Z82022 


Homo sapiens 


GlcNAc-1-P transferase


2067 


99 


303 


AF269232


Mus musculus


butyrophilin-like protein 
BUTR-1 


271 


50 


304 


AJ222644 


Arabidopsis
thaliana


asparaginyl-tRNA synthetase


659 


50 


305 


AF054180


Homo 
sapiens 


hematopoietic cell derived
zinc finger protein 


351 


79 


306 


AJ272079 


Homo sapiens 


APOBEC-1 stimulating protein 


3056 


100 


308 


Y44486 


Homo 
sapiens 


Human GPRW receptor 
polypeptide . 


1721 


100 


309 


AJ131891 


Homo sapiens 


DNA polymerase mu


2598 


100


310 


AF293335 


Homo sapiens 


P30 DBC 


1248 


92 


311 


AF176525 


Mus musculus 


F-box protein FBL12


1501 


93 


312 


X57802 


Homo sapiens


immunoglobulin lambda light 
chain 


959


81 


313


Z36715


Homo sapiens 


Net 


2048 


98 


314 


AF161532 


Homo sapiens 


HSPC047 


727 


100 


315 


AF208068 


Homo sapiens 


kelch-like protein KLHL3a 


3046 


100 


316 


Y66666 


Homo 
sapiens 


Membrane-bound protein 
PRO1013. 


1166 


100 


317 


Y29666 


Homo sapiens 


Human Ras protein RAPR-1.


1253 


98 


318 


AJ387747 


Homo sapiens 


sialin 


2614 


99 


319 


AF161362 


Homo sapiens 


HSPC099 


224 


40 


320 


Y68773 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-5. 


2243 


99 


321


AJ238379 


Homo sapiens


putative TH1 protein


3013 


100 


322 


AB040812 


Homo sapiens 


protein kinase PAK5


3792 


99 


"> o i 
3 J. j 


vo cm i 
Xiri> VJL J 




Ut imp rj qprrp t~ f*cl nrotPl n 

VC48_1, SEQ ID NO: 66. 


913 


100 


324 


Y13381 


Homo sapiens 


Amino acid sequence of 

Y\~r~r^ t~ pin DI?0"7*71 
ul.(JLclii trC\\J4 / x « 


1976 


100 


325 


Y94 944 


Homo sapiens 


TJi yi c* o r~" >^£a ^ tf^ T^>*0 ^ ^ S "O 1 f%T1 

bf 157^16 protein sequence 
SEQ ID NO: 94 . 


2305 


98 


326 


Y76884 


Homo sapiens 


Retinoblastoma binding 
protein- 7 sequence . 


^6728 


99 


327 


AF198532 


Homo sapiens 


lymphoid enhancer binding 
factor-1


2173 


100 


328 


Z78013 


Caenorhabdit
is elegans 


Similarity to Drosophila 
Cadherin-related tumor
suppressor 


569 


33 


329 


AF212921 


Mus musculus 


MMTV receptor variant 1 


4 o4 


Q A 


330 



Z75330 


Homo 
sapxens J 
>R6S207 
R65207 02- 
MAR- 1995 27- 
AUG- 1993 
Human 

stromalin-1. 

I nuiuu 
sapiens 


nuclear protein SA-1 


6492 






TV TO rt P Rill 




dJ327J!6.3 (supported by 
GENSCAN, FGENES and GENEWISE) 


2133 


99 




X ^ oiut 


M-nmo ■Q - 3T>"tf»Tl^ 


Extended human secreted 
protein sequence, SEQ ID NO. 
489. 


310 


41 


333 


AJ271669 


Homo sapiens 


putative sialoglycoprotease 


1747 


100 


334 


AF156598 


Mus musculus 


p53-regulated DDA3


997 


64 


J J 3 




Eimeria 
maxima 


em100 gene is homologous to the
Eimeria tenella gene et100


154


26




VftR^fid 
x o jjoi 


Homo sapiens


Human homologue of UNC-53
(Hs-UNC-53/l) sequence. 


3386 


97 








Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


2602 


94 

i 


338 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


3447 


98


339 


Z66561 


Caenorhabdit
is elegans 


Similarity to Human rabl3 
protein (PIR Acc. No.
A49647).


716 


34 


340 


AB021643 


Homo 
sapiens 


gonadotropin inducible 
transcription repressor-3


2761 


99 


341 


G01946 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6027. 


465 


98 


342 


AF020591


Homo sapiens 


zinc finger protein 


1091 


48 


343 


L29154


Homo sapiens 


immunoglobulin heavy chain
VDJ region


1439


84






344 


U10281 


Sus scrofa


gastric mucin






345 


AK000404


Homo sapiens 


unnamed protein product


11// 




346 


Jb22557 


Rattus 
norvegicus


calmodulin-binding protein


1 Q A O 


P A 


J47 




KaLLUS 

1 1 ■ V CM J» V"* lf> 


( — QXlUvUUJ .1 liUXUUXU^J pLULCUl 






J ** o 


AL049481


Arabidopsis
thaliana


AIG1-like protein


316 


30 


350 


AJ251516 


Mus musculus 


cysteine and histidine-rich 

J> U LC 111 


1460 


99 


351 


AK024477 


Homo sapiens 


FLJ00070 protein


1773 


100 










3U« 


-j 
j j 


O C 1 




Homo sapiens


unnamed protein product




1 u u 


354 


AF161420


Homo sapiens 


HSPC302 


2623 


97 


355


AJ010014 


Homo sapiens 


M96A protein 


1269 


47 


356


AF151029 


Homo sapiens 


HSPC195 


941 


91 


357 


AL022327 


Homo sapiens 


dJ355C18.1 (KIAA0027) 


1911 


100 


358 


W78128


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 
HOSBI96 . 


1117 


100 


359 


X03414 


Drosophila 
melanogaster 


Kr polypeptide 


316 


45 


360 


AF151079


Homo sapiens 


HSPC245 


643 


100


361 


Y53886 


Homo sapiens 


A suppressor of cytokine 
signalling protein 
designated HSCOP-6. 


530 


41 


362 


AF254741 


Drosophila 
melanogaster


Centaurin Gamma 1A


661 


46 


363 


AF213465


Homo sapiens 


dual oxidase 


2016 


100 


364 


AF181562 


Homo sapiens 


proSAAS 


1319 


100 


365 


AF181562 


Homo sapiens 


proSAAS 


1024 


99 


366 


U73200 


Mus musculus 


p116Rip


884


82 


367 


AF263744 


Homo sapiens 


erbb2-interacting protein
ERBIN 


4973 


99 


368 


U37501 


Mus musculus 


laminin alpha 5 chain 


5867 


72 


369 


AF043695 


Caenorhabditis elegans


similar to the protein 
phosphatase 2c family


549 


36 


370 


Y73440 


Homo sapiens 


Human secreted protein clone 
yj23_l protein sequence SEQ 
ID NO: 102. 


1484 


99 


371


AF272833


Homo sapiens 


misato


2869


97 


372 


AF198454 


Homo sapiens 


epithelial protein lost in 
neoplasm beta 


3927


100


■a T> 


17334b 


Homo sapiens 


HTRM clone 438283 protein 
sequence . 


273 




374


At J. b y Ul / 


Homo sapiens


formiminotransferase
cyclodeaminase


2717 








viniQBncii lea 


RED ALPHA 


1202 




376 


W74828 


Homo sapiens 


Human secreted protein 
encoded by gene 100 clone 
HLQA352. 


1012 


99 


377


* j^JIJ J. 


Homo sapiens


Human LYST-2 protein. 


3556 


Q Q 


378


mi /ion 


Homo sapiens


pol 


132 


ft ^ 


379 


AF090934 


Homo sapiens 


PRO0518 


382 


100 


380


AO O J OJ 


Homo sapiens


serine/threonine protein 
kinase 


2499


100 



381 


Y41699 


Homo 
sapiens 


Human PRO703 protein 
sequence . 


2362 


100 


382 


AF174498 


Homo sapiens 


GR AF-l specific protein 
phosphatase 


7008 


98 


383 


U64608 


Caenorhabditis elegans


coded for by C. elegans cDNA 
ykl73cl2.5 


246 


36


384


US0133 


Homo sapiens 


ankyrin


502 


33 


385 


AJ238520 


Homo sapiens 


putative transcription 
factor-like nuclear regulator


4123 


97 



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TABLE 2 



ID 

NO: 


ACCESSION
NUMBER 






WATERMAN 
SCORE 


%

IDENTITY 


387 


AF208845 


Homo sapiens


BM-003 


1375 


99


389 


X57B21 


Homo sapiens 


immunoglobulin lambda light 
chain 


797 


76 


390 


AF182404 


Homo sapiens


mitochondrial uncoupling
protein 1 


1670 




391 


Y85564 


Homo sapiens 


Human homologue of UNC-53
(Hs-UNC-53/1 ) sequence. 


3386 


97 


393 


AF178432 


Homo sapiens 


SH3 protein 


3700 


100 



394 


AF229928 


Drosophila 
melanogaster 


cytoplasmic protein 89BC 


1616 


62 


395 


AF181721 


Homo sapiens


RU2S 


2254 


100


396 


Y69197 


Homo sapiens 


Amino acid sequence of a 
human beta IV-spectrin


1626 


98 


397


048238 




<7 -i Y\f f S nnPT' nrnhoi r> nonrA. rl4 




D U 


398 


AL390137 


Homo sapiens 


hypothetical protein 


263 


51 


399 


AF217525 


Homo sapiens 


Down syndrome cell adhesion 
molecule 


5337 


60 


400 


AL022599


Schizosaccharomyces pombe


WD repeat protein 


447 


27 


401 


AC004859


Homo sapiens 


similar to 2-oxoglutarate 
dehydrogenase; similar to
Q02218 (PID:g1352618)


4176 


78 


402 


AB010266 


Mus musculus


tenascin-X 


10246 


62 


403 


AL133288 


Homo sapiens 


dJ671D7.1 (similar to 
D. melanogaster CG5986 
protein) 


761


100 


404 


Z68753 


Caenorhabditis elegans


ZC518.3b 


f> f»l o 

888 


48 


405


Z78013


Caenorhabditis elegans


similarity to Drosophila
cadherin-related tumor
suppressor 






406 


AB031230 


Homo sapiens 


protein containing CXXC 

domain 2


1196 


97 


A Fit 


AT 1331UD 


Homo sapiens


iM x tveir* .} o anuigen 






408 


Y57945 


Homo sapiens 


Human transmembrane protein 
HTMPN-69. 


1538 


99 




6IOJ O X 


Ovis aries


trichohyalin


1 R4 




410 


AF249744 


Homo sapiens 


RhoGEF 


2733 


100 


411 


Arl /63£3 


Mus musculus


r -dox procein idaij 


& U / A 




412 


AF210842 


Homo sapiens 


HARP 


4880 


100 


A 1 ~3 


7\t r\*a t ^ c o 
ALUJlobo 


Homo sapiens 


QjiiooiJ . / (novel procein 

3 1 


/ / O 




414 


X57398 


Homo sapiens 


pm5 protein 


6131 


99 


415 






carboxylase biotin-containing 
Buhuntt 


2961 


99 


416 


U43503 


S a c charomy c e 


Lphlp 


115 


42 


417 


AL160493


major


Dossible t26f 17 21 


239 


35 


"418 


Y08100 


Homo sapiens


Human PRO331 protein.


330 


29 


419 


U15131 


Homo sapiens 


P 126 


2228 


54 


420 


AF117946 


Homo sapiens


Link guanine nucleotide
exchange factor II 


2363 


100 


421 


AF190635


Drosophila 
melanogaster 


ankyrin 2 


755 


30 


422 


AF302150 


Homo 
sapiens 


phosphoinositol 3-phosphate-binding protein-2


1962 


100 


423 


AL137530


Homo sapiens 


hypothetical protein 


433 


94 


424 


X63753 


Homo sapiens 


son-a


7269 


100 


425 


AB027249 


Homo sapiens 


MAPKK like protein kinase 


1693 


100 


426 


AF279144 


Homo sapiens 


tumor endothelial marker 7 
precursor


1084 


55 



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TABLE 2 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


%

IDENTITY 


427 


AF279144 


Homo sapiens 


tumor endothelial marker 7 
precursor 


1259 


56 


428


AE003683 


Drosophila 
melanogaster 


CG8312 gene product 


149 


29 


429 


Y07829 


Homo sapiens 


RING finger protein 


2201 


99 


430 


AF096897 


Drosophila 
melanogaster 


pushover 


4442 


47 


431 


U41387 


Homo sapiens 


Gu protein


4021 


99 


432 


AF023674 


Homo sapiens 


nephrocystin


3783 


100 


433 


AF146760 


Homo 
sapiens 


septin 2-like cell division
control protein 


2284 


100 


434 


AB006697 


Arabidopsis 
thaliana


cleft lip and palate 
associated transmembrane 
protein-like 


886 


42 


437 


Y94247 


Homo sapiens 


Human calcium binding protein
hCBP. 


1704 


100 


438 


AB040672 


Homo sapiens


UDP-GalNAc:polypeptide
N-acetylgalactosaminyltransferase


1075 


63 


439 


AF105228 


Bos taurus 


tuftelin 


285 


33 


440 


R06463 


Homo sapiens 


Derived protein of clone 
ICA13 (ATCC 40S53) . 


3073 


95 


441 


X14971 


Mus musculus 


alpha-adaptin (A) (AA 1-977)


4897 


98 


442 


X53773 


Rattus 
norvegicus 


alpha-c large chain (AA 1-938)


3979 


81 


443 


Y66689 


Homo 
sapiens 


Membrane-bound protein 
PROH36 . 


3299 


99 


444 


AC067754 


Arabidopsis 
thaliana


unknown protein; 20348-23707 


114 


33 


445 


AF229032 


Mus musculus 


pili 


2077 


93 


446 


AF056035 


Rattus 
norvegicus 


s-nexilin 


2662 


85 


447 


AF132484 


Mus musculus 


unknown 


478 


51 


448 


K89024 


Homo sapiens 


Polypeptide fragment encoded 
by gene 156 . 


528 


45 


449 


AF161445 


Homo sapiens 


HSPC327 


1606 


100 


450 


Z68753 


Caenorhabditis elegans


ZC518.3b


951 


49 


451 


W39160 


Homo sapiens 


Human partial complement 
factor H protein fragment 3 . 


155 


32 


452 


W85727


Homo 
sapiens 


Novel protein (clone 
BM46 10) . 


2799 


99 


453 


Y53629 


Homo sapiens 


A bone marrow secreted 
protein designated BMS115. 


2810 


100 


454 


D87438 


Homo 
sapiens 


Similar to a C. elegans 
protein in cosmid C14H10 


4069 


100 


455 


AF240468 


Homo sapiens 


nicastrin 


3687 


100 


456 


Z15005


Homo sapiens 


CENP-E 


13305 


99 


457 


M59216 


Homo 
sapiens 


gamma-aminobutyric acid
receptor beta-1 subunit 


2477 


100 


458 


Y73467 


Homo sapiens 


Human secreted protein clone 
yd61_l protein sequence SEQ 
ID NO: 156 . 


966 


100 


459 


W67824 


Homo sapiens 


Human secreted protein 
encoded by gene IB clone 

WOT PMO Q 
rlO LtT l*)^ _7 . 


535 


100 


460 


AF163151 


Homo sapiens 


dentin sialophosphoprotein 
precursor 


279 


19 


461 


D87446 


Homo sapiens 


Similar to a C. elegans 
protein encoded in cosmid 
C27F2 (U40419) 


9196 


99 


462 


G04044 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8125. 


486 


93 


463 


AC002398 


Homo sapiens 


. F25965_l 


1018 


100 


464 


AF064856 


Rattus sp.


7acomp protein 


1845 


84 


465 


AF223408


Homo sapiens 


B99 


3686 


99 



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SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN
SCORE


%

IDENTITY 


466 


AF223408 


Homo sapiens 


B99 


2878


87


467 


AF104415 


Mus musculus


gene trap locus -13 


6336


91 


468 


U53450 


Rattus 
norvegicus 


Jun dimerization protein 1
JDP-1 


196


49 


469 


AL031297 


Homo sapiens 


dJ97P20.1 (novel gene) 


3564


99 


470 


AF257077


Homo sapiens 


eukaryotic translation
initiation factor EIF2B 
subunit 3 


1274 


95 


471


L28125


Podospora
anserina 


beta transducin-like protein 


284 


38 


472


Y84903 


Homo sapiens 


A human proliferation and 
apoptosis related protein. 


2337 


100 


473 


AF144237


Homo sapiens 


LOMP protein 


1 252 


44 


474 


Y71213 


Homo sapiens 


Human irritable bowel disease 
related polypeptide IMX3 9. 


1 838 


100 



475 


Y95006 


Homo sapiens 


Human secreted protein 
ve!3_l, SEQ ID NO: 52. 


3411 


100 


476 


D38549


Homo sapiens 


hal025 is new 


6533 


99 


477


AF241230 


Homo sapiens 


TAK1-binding protein 2



1 3656 


100 



478 


AL031534 


Schizosaccharomyces pombe


putative asparagine synthase


482



40


479


L28125 


Podospora 
anserina 


beta transducin-like protein 


233 


26 


480 


AF161544 


Homo sapiens 


HSPC059 


434 


77 


481


AJ238248 


Homo sapiens 


centaurin beta2 


3986 


99 


482 


Z38061


Saccharomyces cerevisiae


mal5, sta1, len: 1367, CAI:
0.3, AMYH_YEAST P08640
GLUCOAMYLASE S1 (EC 3.2.1.3)


295 


71


483 


AF161381 


Homo sapiens 


HSPC263 



1404 


100 


484 


AF223468 


Homo sapiens 


AD021 protein 


1314 


100 


486 


X57527 


Homo sapiens



alpha 1 (VIII) collagen


4166 


99 


487 


Y19062 


Homo sapiens


39)c3 orotein 



2475 


100


488 


Y73373 


Homo sapiens 


HTRM clone 921803 protein
sequence . 


555 


56 


489 


AL021918 



Homo 
sapiens 


b34I8.1 (Kruppel related Zinc
Finger protein 184)


41 84 


100 



490 


X53773 


Rattus 
norvegicus 


alpha-c large chain (AA 1-938)


4675 



97 


491 


U52426 


Homo sapiens 


GOK I 


1459 


59 


492 


AL359773 


Leishmania 
major 


possible threonine synthase 


702 


45 


493 


AF226614 


Homo sapiens 


ferroportin1


2929 


100 


494 


Z93241 


Homo sapiens 


dJ222E13.1 (novel protein
with some similarity to
Drosophila KRAKEN)


513 


96 


495 


AF036977 


Homo sapiens 


unknown


1812 


100 


496 


U93564 


Homo sapiens 


p40 


133 


45 


497 



Y91405 


Homo sapiens 


Human secreted protein 
sequence encoded by gene 2 
SEQ ID NO: 126.


357 



100 


498 


AF069781 


Drosophila 
melanogaster


Bem46-like protein


653


43 


499 


Y16601 


Homo sapiens 


Human cell-cycle 
phosphoprotein CECYP-2.


1658 


98 


500 


X70944 


Homo sapiens 


PTB-associated splicing
factor


3883 


100 


501 


AF027503 


Mus musculus


putative membrane-associated
guanylate kinase 1


205 


36 


502 


AF282874


Homo sapiens 


nectin 3; PRR3


2856 


99 


503 


AJ249732


Homo sapiens 


G8 protein


669 


100


504 


AF208861 


Homo sapiens 


BM-019


1629 


100 


505 


L09708 


Homo sapiens 


complement component C2


4022


100 


507 


X66285 


Mus musculus


HC1 ORF


115 


43 


508 


D00189 


Rattus 
norvegicus 


Na+, K+-ATPase alpha-subunit 


5227


99 



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TABLE 2 



SEQ 
ID 

NO:


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN 
SCORE 


IDENTITY 


509 


Y94971 


Homo sapiens 


Human secreted protein clone , 
ID NO-148 


2176 


100 


510 


AB019038


Homo sapiens 


beta-1,4 mannosyltransferase


781 


77 


511


j\a u x 3vju 






1347 


100 


512






beta-1,4 mannosyltransferase


1520


99 


513 


X84908 


Homo sapiens 


phosphorylase kinase 


5729 


99 






Homo sapiens


peptidylprolyl isomerase


650 



76 


515 


AF186084


Homo
sapiens 


epidermal growth factor
repeat containing protein 


3046 


99 


516 


G03602 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7683. 


enc 




517 


U04706 


Bos taurus


bu KDa protein 




"7*7 


518 


G00653 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 4734 . 


530 


100 


519 


AF161475 


Homo sapiens 


HSPC126 


1368 


100 


520 


Y99366 


Homo sapiens 


Human PRO1475 (UNQ746) amino
acid sequence SEQ ID NO: 88. 


3394 


97 


521 


AF266852


Homo sapiens


PTPIA 


1295 


100 


522


AE000995 


Archaeoglobus fulgidus


chromosome segregation 
protein (smcl) 


153 


20 


523 


AF0S2249 


Homo sapiens 


immunoglobulin heavy chain 
variable region


605 


S7 


524 


AJ223830 


Rattus 
norvegicus 


ARE1 


2950 


98 


525 


W01535 


Homo sapiens 


Cellular homologue of the 
SV40 large T antigen. 


1276 


83 


526 


AF145658 


Drosophila
melanogaster


BcDNA.GH10229


320 


33 


527 


AF112213 


Homo sapiens 


putative Rab5-interacting
protein 


524 


79 


528


D49387 


Homo 
sapiens 


NADP dependent leukotriene b4 
12-hydroxydehydrogenase


1616 


100 


529 


Y30819 


Homo sapiens 


Human secreted protein 
encoded from gene 9 . 


328 


32 


530 


AL079335 


Homo sapiens 


dJ132F21.3 (72.1 KDa protein 
(DKFZP564A032, SBBI88)
similar to mouse IFN-gamma
induced MG11.)


1059 


99 


531 


Y91506 


Homo sapiens 



Human secreted protein 
sequence encoded by gene 56 


1159 


q a 


532 


X76116 


Caenorhabditis elegans


carrier protein vc^j 


3 / O 


so 


r- "3 1 

533 


a7o11d 


Caenorhabditis elegans


<wdX. i. i P^ kJt.tiJ.ll \ / 


506 


50 


534 




Homo sapiens


n , roner>t' ide (4 24 AA) 


1972 


100 


535 


Y09267 


Homo sapiens 


flavin- containing 

monooxygenase 2


2486 


100 


536 


Z11773 


Homo sapiens


SRE-ZBP 


2201 


99 


537


D84224 


Homo sapiens


methionyl tRNA synthetase


4741 


99 


538 


D84224 


Homo sapiens 


methionyl tRNA synthetase 


3887 


99 


539


D84224


Homo sapiens


methionyl tRNA synthetase


£* ^ ^ ^ 


96 


540 


D84224 


Homo sapiens 


methionyl tRNA synthetase 


4529 


99 






□ ut> uuuj.uo 


H+ ATPase 31kDa subunit (EC 
3.6.1.3) 


848 


77 


542


Y92514 


Homo sapiens 


Human 0XRE-11. 


2301 


99 


543 


AF221712 


Homo 
sapiens 


Smad- and Olf-interacting
zinc finger protein 


2151 


61 


544 


AE000919 


Methanobacterium
thermoautotrophicum


conserved protein 


207 


38 


545 


A06669 


synthetic 
construct


preTGF-beta1


2070 


99 



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TABLE 2 



SEQ 
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH - 
WATERMAN 
SCORE 


%

IDENTITY


546


Y02698 


Homo sapiens 


Human secreted protein 
encoded by gene 49 clone 
HTPCS60 . 


854 


98 


547


AF112205 


Homo sapiens 


WSB-1 protein


2275 


100 


548


X60271 


Mus musculus


c-rel 


2264 


74 


549


AC016827


Arabidopsis
thaliana 


putative GTPase 


810 


42 


550 


Y70400 


Homo 
sapiens 


Human cell-signalling
protein-2.


429 


68 


551 


AB048365 


Homo sapiens 


NEDD4-like ubiquitin ligase 1


8290


99 


552 


Y57880 


Homo sapiens 


Human transmembrane protein 
HTMPN-4 . 


1112 


95 


553 


AF119855 


Homo sapiens 


PRO1847


265 


67 


554 


M17236 


Homo sapiens 


MHC HLA-DQ alpha precursor


1332 


100 


555 


AL078468 


Arabidopsis 
thaliana 


putative protein 


540




556 


AC006963 


Homo sapiens 


similar to Kelch proteins;
similar to BAA77027 
(PID:g4650844) 


515 




557 


AK024487 


Homo sapiens 


FLJ00086 protein 


1623 


98 


558 


M12140 


Homo sapiens 


tjoI cf pnp orot^in • Yirv 






559 


W74825 


Homo sapiens 


Human secreted protein 
encoded by gene 97 clone 
HAOBF73 


225 


56 


560 


X565B1 


Homo sapiens


"i 1 1 n n t-s h p i n 


T -IT 

3 / J 


O Q 

88 


561 


AF003136 


Caenorhabditis elegans


r*on ^ ;» "i T"» C WO J^V* -i m -i T -y S i~ -\ r t- r-\ 
U-WJlLaXLLs nCa\ 23jL.EU-L.LclX. JL ty CO 

an iMP-hindinfi mri t* "i f 






562 


AL109839 


Homo sapiens


dJ1069P2 3 1 ( nnvp 1 PJVRPCT 
(doI v (A) -bindincr tiroheiril 


R "7 *7 


i fin 
-L U U 


563 


AF181640 


Drosophila 
melanogaster 


BcDNA.GH09817






564 


AF052723 


Feline leukemia virus


gag-pol precursor polyprotein
gPr80 


1547


43 


565 


AF161472 


Homo sapiens 


HSPC123 


439 


44 


566 


Y28817 


Homo sapiens 


pt326_4 secreted protein. 


3338 


100


567 


U09848 


Homo sapiens 


zinc finger protein 


1738 


100


569 


AF155113 


Homo sapiens 


NY-REN-55 antigen


3603 


93


570 


AF155113 


Homo sapiens 


NY-REN-55 antigen


3951 


99


571 


AL032821 


Homo sapiens 


dJ55C23.1 (vanin 1) 


1821 


9P 1 

1 


572 


M69181 


Homo sapiens 


non-muscle myosin B 


7350


99


573 


M69181 


Homo sapiens 


non-muscle myosin B 


7311 


98


574 


Y59678 


Homo sapiens 


Secreted Drotein 108-008-5-0- 
E6-PI,. 


772 


100


575 


AL365234 


Arabidopsis 
thaliana


putative protein


788 


40


576 


AL365234


Arabidopsis 
thaliana 


putative protein


788 


40 


577 


X06745


Homo sapiens 


DNA polymerase alpha-subunit
(AA 1-1462)


7619 


99 


578 


AB041642 


Homo sapiens 


PAR-6


1342 


100


579 


D86984 


Homo sapiens 


similar to yeast adenylate 
cyclase (S56776)


2446 


100 


580 


AF165124 


Homo sapiens 


gamma-aminobutyric acid A
receptor gamma 2 


2499 


99


581 


W88812 


Homo sapiens 


Polypeptide fragment encoded 
by gene 58 . 


2339 


99


582 


D82319 


Homo sapiens 


novel ORF


342 


100


583 


P92219 


Homo sapiens 
(human) 


CR1 protein.


11425 


99


584 


AJ223948 


Homo sapiens 


RNA helicase 


6608 


99


585 


Y08612 


Homo sapiens 


88kDa nuclear pore complex
protein 


3874 


99


586 


Y42384 


Homo 
sapiens 


Amino acid sequence of 
Iv3l0 7. 


1007 


37


587


AF129756 


Homo sapiens 


BAT4 


1873 


98



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TABLE 2 



SEQ 
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


% 

IDENTITY 


588 


AF131775 


Homo sapiens 


Unknown 


1929 


99 


589


AJ250865


Homo sapiens 


TESS 2 


2348 


100 


591 


Z98885 


Homo sapiens 


dJ522J7.2 (bromodomain-containing
1 (similar to
peregrin, BR140))


4167 


100 


592 


L76571 


Homo sapiens 


nuclear hormone receptor 


1355 


100 


593 


AF091622 


Homo sapiens 


PHD finger protein 3 


9054 


100 


594 


X56807


Homo sapiens 


desmocollin type 2a 


4443 


100 


595 


AL137802 


Homo sapiens 


dJ798A10.1 (novel protein) 


212 


55 


596 


AL022329


Homo 
sapiens 


bK407F11.2 (adrenergic, beta, 
receptor kinase 2) 


3653 


100 


597 


AF226048 


Homo sapiens 


GL003 


2009 


99 


598 


AJ278112 


Homo 
sapiens] 
>Y49635 
Y49635 21- 
OCT-1999 15- 
APR-1998 
Human sdp3 . 5 
protein. 
[Homo 
sapiens 


putative cell cycle control 
protein 


335 


23 


599 


Y59741


Homo sapiens 


Human normal ovarian tissue 
derived protein 10 . 


1574 


99 


600 


L36531 


Homo sapiens 


integrin alpha 8 subunit 


5386 


99 


601 


Y38458 


Homo sapiens 


Human secreted protein 
encoded by gene No. 20. 


895 


100 


602 


AF218584 


Homo sapiens 


GGA1


3265 


100


603 


Y13115


Homo sapiens 


serine/threonine protein
kinase 


5071


99 



604 


AL132776 


Homo sapiens 


dJ393D12.1 (KIAA0776)


2413 


99 


605 


AL034452


Homo sapiens 


dJ682J15.1 (novel Collagen
triple helix repeat 
containing protein) 


1979 


100 


606 


Y14494 


Homo sapiens 


aralar1


3465


99 


607 


AJ001981 


Homo sapiens 


OXA1L 


2603 


100 


608 


X86098 


Homo 
sapiens 


binds directly to adenovirus 
type 5 E1A protein 


3069 


100 


610 


AF163572 


Homo sapiens 


Forssman glycolipid
synthetase 


1865 


99 


611 


AF161503 


Homo sapiens 


HSPC154 


1261 


97 


612 


L41834 


Ensis minor 


nuclear protein 


345 


30 


613 


Y91954


Homo sapiens 


Human cytoskeleton associated 
protein 9 (CYSKP-9) . 


3668 


100 


614 


AL022327 


Homo sapiens 


dJ355C18.1 (KIAA0027)


361 


94 


615 


X85786


Homo sapiens 


binding regulatory factor 


3203 


100 


616 


Y08319 


Homo sapiens 


kinesin-2 


3487 


99 


617 


D12644 


Mus musculus 


KIF2 protein 


3609 


97 


618 


U28789 


Mus musculus 


PACT 


5936 


89 


619 


Y35914 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
163. 


1684 


99 


620 


AB0463B2 


Mus musculus 


testis-abundant finger
protein 


199 


23 


621 


Y00062 


Homo sapiens 


precursor polypeptide (AA -23 
to 1120) 


3440 


99 


622 


AF068286 


Homo sapiens 


HDCMD38P 


861


100


623 


X98248 


Homo sapiens 


sortilin 


4436 


99 


624 


X61100


Homo sapiens 


75 kDa subunit NADH 
dehydrogenase precursor 


3734 


99 


625 


S58544 


Homo sapiens 


75 kDa infertility-related
sperm protein 


2125 


99 


626 


AF151027 


Homo sapiens 


HSPC193 


582 


93 


627 


X14968 


Homo sapiens 


RH-alpha subunit (AA 1-404) 


2079 


100 


628 


Y50911 


Homo sapiens 


Human fetal brain cDNA clone 
vb7_l derived protein


1983 


100 
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TABLE 2 
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SEQ 

ID
NO:


ACCESSION 


SPECIES 


DESCRIPTION 


SMITH- 

WATERMAN


%

IDENTITY 


629


Y50911 




Human fetal brain cDNA clone

vb7_l derived protein 


■LD 3"* 






AF098786 


Homo sapiens


"1*7 V^o f~ — hvH tt^i t~ v/-! i 
a » ucv,a iiyuxuAyoUcruiu 

dehydrogenase type VII


J. / 3 *± 




631 


AL034555


Homo 
sapiens 


dJ134019.3 (zinc finger
protein 151 (pHZ-67)) 


4273 


100


632 


W74826 


Homo sapiens 


Human secreted protein 
encoded by gene 98 clone 
HAQBT94 . 


794 


96


633 


AF288288 


Homo sapiens 


HPT protein 


2236 


100


634 


AF041429 


Homo sapiens 


pRGRl 


823 


99 


635 


X66357 


Homo sapiens 


serine/threonine protein
kinase


1589 


100


636 


Y11284 


Homo sapiens 


AFX1 


2571 


98 


637 


AB004884


Homo sapiens 


PKU-alpha


3718 


99 


638


AJ002303


Homo sapiens






100


639 


AJ002304 


Homo sapiens 


synaptogyrin lb 


1002 


100 








^yocipLogyxrjLri jl c 


Q ~i 1 


Qyt 


D fi J- 


no. 1 C O "5 
Do / b 0 £. 


Homo sapiens


similar to a C. elegans
protein encoded in cosmid


£.K* /O 


100


642 


M14660 


Homo sapiens 


ISG-K54 


2473 


99 


643


X06661 


Homo sapiens 


calbindin (AA 1-261)



1358 


100 


644 


AF119900 


Homo sapiens 


PRO2822


185 


76 


645 


AB031048 


Drosophila
melanogaster 


microtubule associated- 
protein orbit 


738 


27 


646 


AF250842


Drosophila 
melanogaster 


multiple asters 


834 


29 


647 


X86691 


Homo sapiens 


Mi-2 protein


10110 


99 


648 


U67934 


Homo sapiens 


44.9 kDa protein C18B11 
homolog 


827 


96 


649 


AF236061 


Oryctolagus 
cuniculus 


RING- finger binding protein 


3330


91 


650 


AL034553 


Homo sapiens 


dJ914P20.2 (KIAA0784 protein 
similar to Mus musculus 
activity-dependent
neuroprotective protein
(Adnp))


5708 


100 


653 


X14766 


Homo sapiens 


GABA-A receptor alpha 1 
subunit


*> r* o o 
23 DO 






HLUUn Di4 


Homo sapiens


S2.nm.ar co r-sponuin proteina 
Anftft£ftfi£ ( PT"n •er2 ( %'?Q22S ^ 


_> vi <<S O 


99 


655 


Y57908 


Homo sapiens 


Human transmembrane protein 
HTMPN-32 


608 


99 


656 



Z34975 


Homo sapiens


ldlCp 


3733 


100 


658 


AL050306 


Homo sapiens 


dCT475B7.2 (novel protein) 


1942 


99 


659 



W76734 


Homo 
sapiens 


Human mDia Rho targeting
protein. 


781 


34 


660 


AF202724 


Homo sapiens 


Sad1 unc-84 domain protein 1


2172 


100 


661 



Z21966 


Homo sapiens


mPOU homeobox protein


1529 


100 


662 



AJ242954 


Mus musculus


wy fc> a. w a. x ^ a 


4752 


59 


663 


AF182316 


Homo sapiens 


myoferlin


6232 


99 




Able JIDX O 


fr"Vi;%l 4 


hypothetical protein


0 ft Q 


3D 


DO/ 




W<"irr%#"\ cnT\^ avi c* 

riLnrio 9dp^cHo 


Vaiyi- tlvNii SyXlCI16ucLoe 




99 


668 


Y1335S 


Homo sapiens


Amino acid sequence of
protein PRO220.


3692 


100 


669 


AB010692 


Arabidopsis 
thaliana


contains similarity to endo- 

beta-N-acetylglucosaminidase 

gene 


611 


S2 


671 


X56123


Mus musculus


talin 


4474 


76 


672 


AB039371 


Homo sapiens 


mitochondrial ABC transporter 
3 


2902 


99 


673 


AF269223 


Homo sapiens 


TCP11 


806


42 


674 


AF229633 


Mus musculus 


groucho-related protein 4 


4053 


99 


675 


L14463 


Rattus 


transducin


3619 


92 
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SEQ 
ID 
NO: 



676 



677 



678 



679 
"680" 



681 



682 



683 



684 



685 



686 



687 



688 
"689" 



690 



691 



694 



695 



696 



697 



698 



699 



701 



702 



703 



704 



705 



706 



707 



708 



709 



710 



711 



TABLE 2 



ACCESSION 
NUMBER 



SPECIES 



DESCRIPTION 



AC005757 



norvegicus 
Homo sapiens 



R32611 1 



S61069 



AF271388 



Homo sapiens | reverse transcriptase 

homolog=pol {retroviral 

element)

CMP-N-acetylneuraminic acid

synthase 



Homo sapiens 



X79066 
AF118566 



Homo sapiens 
Mus musculus



ERF-1 

hematopoietic zinc finger 
protein 



Y51415 



Homo 
sapiens 



Human wild type pKe83 
protein . 



SMITH- 
WATERMAN 
SCORE 



IDENTITY 



2779 



100 



252 



65 



2273 



100 



1783 



100 



769 



50 



"2T2T 



99 



AL133545 



Homo sapiens 



bA386N14.1 (novel protein 
similar to a dual specificity 
phosphatase)



Y86214 



Homo sapiens 



Nuclear transport protein 
clone hfb341 protein 
sequence 



Y94952 



AL021878 



Homo sapiens 



Homo sapiens 



Human secreted protein clone 
fhll6_ll protein sequence 
SEQ ID NO: 110. 
LJ257I20.4 (transcription 
factor 20 (AR1) (KIAA0292) 
(isoform 2) ) 



AE000198 



orf, hypothetical protein 



M58378 
AF039697
U09355 



Escherichia 

coli

Homo sapiens | synapsiiPl 
Homo sapiens | antigen NY-CO-31



Oryctolagus 
cuniculus



protein phosphatase 2A1 B
gamma subunit



AF155106 | Homo sapiens | NY-REN-36 antigen



AC004774 | Homo sapiens | Dlx-5 



X90530
X90530 



Homo sapiens | ragB
Homo sapiens | ragB



X90530 



Homo sapiens | ragB 



G01563 



AC011810 



Homo sapiens | Human secreted protein, SEQ 

ID NO: 5644 . 

Arabidopsis | Putative methionine 
thaliana | aminopeptidase 



AJ250425



Rattus 
norvegicus 



Collybistin I 



AB037901 



Homo 
sapiens 



gene amplified in squamous 
cell carcinoma-1



Y99401



Homo sapiens 



AF221712 



Homo 
sapiens 



Human PRO1327 (UNQ687) amino
acid sequence SEQ ID NO:218.
Smad- and Olf -interacting 
zinc finger protein 



X83573



Homo sapiens | ARSE



AJ243274 | Homo sapiens | AP-2rep protein



Y71262 



Homo sapiens | Human chondromodulin-like

protein , Zchml . 



Y71262 



Homo sapiens | Human chondromodulin-like

protein, Zchml . 



Y41257 



AL022237 



Homo sapiens | Amino acid sequence of long

human FAIM . 

Homo sapiens jbK119lB2.3 ( PUTATIVE novel 

Acyl Transferase similar to 
C. elegans C50D2.7) (isoform 



700 



68 



5888 



99 



354 



154 



98 



67 



628 



100 



3730 
508
2356 



99 
98 



99 



265
1542 



50 
100 



1926 
1405 



99 
99 



1590 



330 



669 



2455 



5364 



1386 



6705 



3184 



2078 



1697 



1736 



AJ006266



Homo sapiens 



G01571 



Homo sapiens 



Y08698 



Homo sapiens 



Y68770 



Homo sapiens 



1) ) 



AND-1 protein 



Human secreted protein, SEQ 
ID NO: 5652. 



ranbp3 



Amino acid sequence of a
human phosphorylation 
effector PHSP-2. 



1060 



2030 



5942 



777 



2849 



754 



85 



100 



52 



98 



99 



100 



100 



99 
99



94 



99 



100 



100 



100 



99 



98 



99 
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TABLE 2 
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SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH - 
SCORE 


* 








Diitar ivp niso 


799 


59 


713 


AC004531 


Homo sapiens 


Gene with similarity to DEAD
box helicases 


2715 


99 


714 


D89016 


Homo sapiens 


Neuroblastoma 


538 


48 


715 


Y92175 


Homo sapiens 


Human cardiovascular system 
associated protein tyrosine 
phosphatase 2 . 


734 


98 


716 


T\T "1 Q ft "1 "3 

AJjIJ /Ul J 


riomo sapiens 


phosphoribosyltransferase)


862


100 


717 


AB035123 


Mus musculus


gdi aipna/vjiia aj.pna / jjd 
ajLpjia synutase 


1D3 D 




718 


196290 


HOmo >P40^do4 
P40254 25- 
OCT-1984 09- 
APR-1983 
Human IgD . 
[Homo 
sapiens 


ri u.nuc_L n XKyc Mf J — ^ liisuuiiugiuiJUi in . 


A J M W 




719 


X07979 


Homo sapiens 


integrin beta 1 subunit 
precursor 


4347 


99 


720 


AJ224819 


Homo sapiens 


tumor suppressor 


2149 


99 


721 


Y07595 


Homo sapiens 


transcription factor TFIIH 


2373 


100 


722 


W41565 



Homo 

sapiens] 

>W41S64 

W41564 08- 

OCT-1997 05- 

APR-1996 

Human 

calpain . 

[Homo 

sapiens 


Human calpain. 



1591 


99 


723 


AF161341 


Homo sapiens 


HSPC078 


1097 


98 


724 


AF187318 


Homo sapiens 


F-box protein Fbx2 


1607 


100 


725 


AC006708 


Caenorhabdi t 
is elegans 


contains similarity to
Saccharomyces cerevisiae pre- 
mRNA splicing protein PRP31 
(GB:Z72876)


1143 


46 


726 


AC006708 


Caenorhabdi t 
is elegans 


contains similarity to
Saccharomyces cerevisiae pre- 
mRNA splicing protein PRP31 
(GB:Z72876)


988 


46 


727 


AC024818 


Caenorhabditis elegans


contains similarity to Ptam 
G-beta repeat), ecore-81.8, 


950 


44 


728 


AJ005897 


Homo sapiens 


JM5 


831 


47 


729


Y453 7 / 


Homo sapiens


Human secreted protein

fragment encoded from gene 

97 


908 


97 


730 


G03931 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 8012. 


578 


100 


731 


AB012 720 


Oncorhynchus
masou




3 865 


76 


732 


W73404 


Homo sapiens 


Human secreted protein
encoded by Gene No . 8 . 


862 


97 


/ j j 




Homo sapiens


Human secreted protein, SEQ
ID NO: 6731. 


644 


97 


734 


AC024813


Caenorhabditis elegans


Hypothetical protein 
Y54Fl0AL.a 


152 


24 


735 


AL035461 


Homo sapiens 


dJ967N21.6 (novel CDP-alcohol 
phosphatidyltransferase
family member protein) 


1562 


98 


736 


U00033 


Caenorhabdit 
is elegans 


similar to S. cerevisiae YJU2 
protein 


605 


41 


737 


AF079098


Homo 
sapiens 


arginine-tRNA-protein
transferase 1-1p; ATE1-1p


2733 


99 






WO 01/53312 



PCT/US00/34263



TABLE 2 



SEQ 
ID 

NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-WATERMAN
SCORE


%
IDENTITY


738 


AJ131712 


Homo sapiens 


nucleolar RNA-helicase 


2793 


100 


739 


AJ133115 


Homo sapiens 


TSC-22-like protein


2054 


99 


740 


X98258 


Homo sapiens 


M-phase phosphoprotein 9 


953 


100 


741 


X98258 


Homo sapiens 


M-phase phosphoprotein 9




74 


742 


U97191 


Caenorhabditis elegans


strong similarity to the xirii
sub-family of RAS proteins


gen 

7 w w 




743 


X76057 


Homo sapiens 


phosphomannose isomerase


_____ ___ 


100 


744 


G03209 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7290.






745 


X97064 


Homo sapiens 


Sec23 protein 


4034 


99 


746 


W93946 


Homo sapiens 


Human regulatory molecule 
HRM-2 protein. 




ion 


747 


Y73388


Homo sapiens 


HTRM clone 3376404 protein 
sequence.


1565 


99 


748 


M19529 


Sus scrofa


follistatin A 


t one 


O Q 


749 


AJ249457


Trichomonas 
vaginalis 


centrin, putative 


"I ft "5 


"> A 

_o 


750 


AC004410


Homo sapiens 


fos39554_1




inn 

J.UU 


751 


AF074968 


Homo sapiens 


P47ING3 protein 


2167 


100 


752 


AF252284 


Homo sapiens 


transcription specificity
factor Sp1


rt c 


inn 

1U u 


753 


AB049629 


Homo sapiens 


phospholysine
phosphohistidine inorganic
pyrophosphate phosphatase


1375 


99 


754 


D79205 


Homo sapiens 


ribosomal protein 1»3 3 


lb U 


1 "7 
/ V 


755 


AB008430 


Homo sapiens 


CDEP 


142 


29 


758 


L32162


Homo sapiens 


transcription factor 


574 


80 


759 


AF037204 


Homo sapiens 


RING zinc finger protein 


295 


54 


760 


Y44250 


Homo 
sapiens 


Human cell signalling
protein-13.


625


100 


761 


AF218586 


Homo sapiens 


Cide-b 


1136 


100 


762 


U38934 


Gallus 
gallus 


histone H2A


625 


97 


763 


AF226053 


Homo sapiens 


HSKM-B 


606 


32 


764 


X13403 


Homo sapiens 


Oct-1 protein (AA 1 - 743) 


3626 


100 


765 


D87446 


Homo sapiens 


Similar to a C. elegans
protein encoded in cosmid
C27F2 (U40419)


568 


38 


766 


AL023828 


Caenorhabditis elegans


Y17G7B.14


2UO 


_ / 


767 


Y82777 


Homo sapiens 


Human chordin related protein
(Clone dw665_4).


__>ni 




768 


X92475 


Homo sapiens 


ITBA1 




100


769 


Y42752 


Homo sapiens 


Human calcium binding protein
3 (CaBP-3).




100


770 


X51416 


Homo sapiens 


hormone receptor hERR1 (AA 1-521)


2641


97


771 


AJ006591 


Homo sapiens 


cysteine-rich protein




100 


772 


A08695 


Homo sapiens 


rap2 


935 


100 


773 


Z12173 


Homo sapiens 


N-acetylglucosamine-6-sulphatase


2970 


100 


774 


Y91950 


Homo sapiens 


Human cytoskeleton associated 
protein x> iv»ioi_r -> / . 


565 


43 


776 


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger) 


855 


56 


777 


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger)


Cj —J 


56 


778


V7 \J X. O O w 




Human secreted protein, SEQ 
ID NO: 5961.


849


98 


779 


AJ012590 


Homo sapiens 


glucose 1-dehydrogenase


4155 


99 


780 


AL078582 


Homo sapiens 


dJ130E4.2 (KIAA0796)


1321 


68 


781 


Z75955


Caenorhabditis elegans


Similar to mitochondrial 
carrier protein 


384 


34 


782 


AL109965 


Homo 
sapiens 


dJ1121G12.2 (SCAN domain-containing 1 protein)


900 


100


783 


AF061262 


Mus musculus


semaF cytoplasmic domain 
associated protein 2 


1316 


83 


784 


G03873 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7954.


649


95






785 


Y84441 


Homo sapiens 


Amino acid sequence of a
human RNA-associated
protein.


2074 


100 


786


vnnoi o 


Homo sapiens 


Human nap protein, KAor-i, 
protein sequence.




99 


787 


Z97029 


Homo sapiens 


ribonuclease HI large subunit 


1548 


99 


788 


AB035384


Homo sapiens 


SRp25 nuclear protein 


962


S4 


789 


AF024631 


Homo sapiens 


ANG2 


2644 


100 


790 


AJ006710 


Rattus 
norvegicus 


phosphatidylinositol 3-kinase


4508 


97 


792 


V00638 


bacteriophage lambda


reading frame ea10


600 


100 


793 


AF049103 


Homo sapiens 


Huntingtin interacting 
protein 


819 


100 


795 


Z26317 


Homo sapiens 


desmoglein 2 


4810 


99 


796 


Y76884 


Homo sapiens 


Retinoblastoma binding
protein-7 sequence.


5080 


99 


797 


U15155 


Gallus 
gallus 


trypsinogen 


372 


37 


798 


U97189 


Caenorhabditis elegans


strong similarity to the
PI3/PI4 family of kinases


227 


28 


799 


AF112201 


Homo sapiens


neuronal protein NP25 


1053 


100 


800 


AF234765 


Rattus norvegicus


serine-arginine-rich splicing
regulatory protein SRRP86


958 


63 


801 


AF267852 


Homo sapiens 


placental protein 13-like
protein


743 


99 


802 


AF208851 


Homo sapiens 


BM-009 


766 


80 


803 


Z81097 


Caenorhabditis elegans


Similarity to Human
retinoblastoma-binding
protein RBAP46 yk662dl2-5 
comes from this gene 


152 


27 


804 


G02113 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6194. 


496 


98 


805 


AL121673 


Homo sapiens 


DA305P22.1 (novel protein) 


1160 


100


806 


AC013483 


Arabidopsis 
thaliana 


putative GTPase activator 
protein 


264 


30 


807 


AC013483 


Arabidopsis 
thaliana 


putative GTPase activator 
protein 


264 


30


808 


AB013885 


Homo sapiens 


beta-ureidopropionase


1494 


100 


809 


AF078842 


Homo sapiens 


HOTTL protein 


1581 


99 


810 


AF161421 


Homo sapiens 


HSPC303


2134 


96 


811 


AF261689 


Homo sapiens 


DNA polymerase epsilon p17
subunit 


734 


100 


812 


Z74029 


Caenorhabditis elegans


Similarity to C. elegans 
alcohol dehydrogenase comes 
from this gene 


610 


71 


813 


Z73497 


Homo sapiens 


CU240C2.2 (Core histone
H2A/H2B/H3/H4)


324 


100 


814 


W87689 


Homo 
sapiens 


Human HTXFT19 polypeptide. 


1484 


99 


815 


X16282 


Homo 
sapiens 


zinc finger protein (217 AA) 
(1 is 2nd base in codon) 


1109 


99 


816 


Z92539 


Mycobacterium tuberculosis


pth 


300


36 


818 


AB030483 


Mus musculus 


B9 


197 


27 


819 


AL117555 


Homo sapiens 


hypothetical protein 


321 


94 


820 


AC005328 


Homo sapiens 


R26660_2, partial CDS 


865 


97 


821 


G03951 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8032. 


700 


99 


822 


L34807 


Musca 
domestica 


transposase 


174 


20 


823 


G02928 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7009. 


558 


78 


824 


Z99531 


Schizosaccharomyces pombe


caffeine-induced death protein 1


184


29






825


AJ006692 


Homo sapiens 


ultra high sulfer keratin 


693 


68 


826 


U23037 


Oryctolagus cuniculus


eIF-2Bepsilon


3406 


90 


827 


G03412 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7493.


464 


100 


828 


Y30327 


Homo sapiens 


Human secreted protein 
encoded from gene 17 . 


113 


44 


829 


Y32199 



Homo sapiens 


Human receptor molecule (REC) 
encoded by Incyte clone 
2022379. 


1012 


100 


830 


W78279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33. 


1264 


99 


832 


AB011542 


Homo sapiens 


MEGF9 


2097 


100 


833 


G02639 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6720. 


223 


70 


834 


AF119664 


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1574 


100 


835 


AF119664 


Homo sapiens 


transcriptional regulator 
protein HCNGP 


1144 


89 


836 


AF119664 


Homo sapiens


transcriptional regulator 
protein HCNGP 


1448 


94 


837 


X12517 


Homo sapiens 


C protein (AA 1-159) 


918 


100 


838 


U32865 


Drosophila 
melanogaster 


linotte protein 


164 


24 


839 


AF067730 


Homo sapiens 


TLS-associated protein TASR-2


631 


56 


840 


U27831 


Homo sapiens 


striatum-enriched phosphatase


2840 


98 


841 


AF286366 


Homo sapiens 


CamKI-like protein kinase 


1796 


100


842 


G02309 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6390.


278 


98 


843


AE003615 


Drosophila 
melanogaster 


ade3 gene product 


113 


48 


844 


G01350 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5431.


629 


100 


845 


U27838 


Mus musculus


glycosyl-phosphatidyl-inositol-anchored
protein homolog


3305 


96 


847 


Y87788 


Homo sapiens 


Human RBP-26 protein. 


2026 


100 


848


AF164794


Homo sapiens 


Diff33 protein homolog 


2398 


100 


849 


U41315 


Homo sapiens 


ZNF127-Xp 


2458 


93 


850 


AF192784 


Homo sapiens 


makorin 1


2062 


97 


851


Y58628


Homo sapiens 


Protein regulating gene 
expression PRGE-21.


1548 


100 


852 


Z22968 


Homo sapiens 


M130 antigen 


6205 


100 


853 


Z22971 


Homo sapiens 


M130 antigen extracellular 
variant 


6380 


100 


854 


G03362 


Homo sapiens


Human secreted protein, SEQ 
ID NO: 7443. 


330


96 


855


G03362 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7443.


203 


100 


856


AF285118 


Homo sapiens 


CGI-203




100


857 


AC006069 


Arabidopsis 
thaliana 


putative cleavage and 
polyadenylation specificity
factor 


i o o "5 
13 o 3 


C C 


858




Homo sapiens


Cytochrome c oxidase
polypeptide VIa-liver
precursor (EC 1.9.3.1)


593 


100 


859 


L02956 


Xenopus 
laevis 


ribonucleoprotein 


1664 


85 


860 


AF201947 


Homo sapiens 


MEK binding partner 1 


616 


100 


861


L31783


Mus musculus 


uridine kinase 


1266 


92 


862 


AF161472 


Homo sapiens 


HSPC123 


602 


73 


863 


Z49068 


Caenorhabditis elegans


mitochondrial carrier protein 


370 


43 


864 


AF154108 


Homo sapiens 


tumor necrosis factor type 1


3559 


99 














865 


AE001530


Helicobacter 
pylori J99 




i o "\ n 


J2 


866 


X57807 


Homo sapiens 


immunoglobulin lambda light
chain


w J J 




867 


AL031673 


Homo sapiens 


dJ694B14.1 (PUTATIVE novel
KRAB box protein with 18 C2H2
type Zinc finger domains)


4066 


99 


868 


Y11652 


Homo sapiens 


phosphate cyclase 


238 


100 


869 


AF192968 


Homo sapiens 


high-glucose-regulated
protein 8


3041 


99 


870 


AB020648 


Homo sapiens 


KIAA0841 protein 


3237 


S9 


871 


AL031427 


Homo sapiens


•ATI 6 7 AT *5 "1 fnrtvpl T^yr^t- oi 


lOVD 


inn 


872 


AF151534


Homo sapiens 


core histone macroH2A2.2 


1866 


100 


873 


AL021331


Homo sapiens 


dJ366N23.1 (putative C, 
eiegans uw^-i?3 (protein 1, 
^•^*or*j..j./ iriiVci protein/ 


1129 


100 


874


«V-A. 9 w O 




propionyl-CoA carboxylase


O C *t Q 


100 


875 


AL117334 


Homo sapiens 


dJ687F11.1 (novel protein
(part of translation of cDNA
DKFZP434N061, Em:AL110249))


306 


100 


876 


X79489 


Saccharomyces cerevisiae


E-925 protein 


446 


35 


877


vc -j no i 


Homo sapiens


Human secreted protein clone 
dn834_1 protein sequence SEQ
i Li wo : o . 


811 


100 


878




Homo sapiens


CHMP1.5


957 


100 






Sus scrofa


40S ribosomal protein S12


687 


100 


880 


AF001317 


Saccharomyces cerevisiae


Soilp 


478 


28 


881 


Y87275 


Homo sapiens 



Human signal peptide
containing protein HSPP-52
SEQ ID NO: 52.


2547 


100 


882 


M14036 


Homo sapiens 


C1-inhibitor


598 


77 


883 


AB041261 


Homo sapiens 


calcium-independent
phospholipase


2903 


100 


884 


AF020313 


Mus musculus


proline-rich protein 48


999 


84 


885 


Y10936 


Homo sapiens 


hypothetical protein 


1104 


99 




At V / 393 / 


Mus musculus


myotubularin related protein 
1 


866 


36 


887 


Y57893 


Homo sapiens 


Human transmembrane protein 

H l iny £4 — 1 / . 


1099 


94 


888


AL117635


Homo sapiens 


hypothetical protein 


929 


99 


889


Ar J10 J17 


Homo sapiens 


facilitative glucose 
transporter family member
GLUT 9 


2046 


99 


890 


Y36031


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO.
416.


583 


100 


891 


Y36031 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
416. 


192 


57 




nr a J / D -j J. 


Homo sapiens


uDiquitous c ropotnoaui in u— 

luKXl 




i r»ft 


893 


AF090929 


Homo sapiens 


PR00477p 


653 


99 


894 


AL031228 


Homo sapiens


BING4 (similar to S. 
cerevisiae YER082C, M. sexta 
MNG10 and C. elegans F28D1.1) 




i nn 


895 


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein
BING4 (similar to S.
cerevisiae YER082C, M. sexta
MNG10 and C. elegans F28D1.1) 


2825 


96 


896 


AF171102 


Homo sapiens 


retinal degeneration B beta 


1302 


95 


897 


AE003551 


Drosophila melanogaster


CG18176 gene product


633 


33 
898


AJ237946


Homo sapiens


DEAD Box Protein 5


2443


100


899


Z97184


Homo sapiens


KKE2


C ~i A 1


100

JL V V 


900
901


Z97184
AJ245587 


Homo sapiens 
Homo sapiens 


KKE2 

Kruppel-type zinc finger 


"409 — j- 




902
903


AF091034
R95953


Homo sapiens
Homo sapiens


GTP-binding protein RAB22A
Eukaryotic cell growth
inhibiting factor.


1011

A T A 


100 

7u 


904


L04733 


Homo sapiens 


kinesin light chain




1~> 
1 £ 


905 


AE003540 


Drosophila 
melanogaster 


CG10984 gene product


446


33


906


M55542


Homo sapiens


guanylate binding protein
isoform I


2993


98


907 


M55542 


Homo sapiens 


guanylate binding protein 
isoform I 


2901


96 


908 


W84085 


Homo sapiens 


Human membrane fusion protein 
WDProl . 


1889 


100 


909


AF168676 


Homo 
sapiens 


TNF intracellular domain-interacting
protein


647


100 


910 


AB029150 


Homo sapiens 


KRAB zinc finger protein 
HFB101L 


2196


100 


911 


G02871 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6952. 


521 


100 


912 


G03162 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7243. 


387


87 


913


AJ243721 


Homo 
sapiens] 
>Y92508 
Y92508 13- 
APR-2000 06- 
OCT-1998 
Human OXRE-5. [Homo
sapiens 


dTDP-4-keto-6-deoxy-D-glucose
4-reductase


1710


100 


914 


U24159


Caenorhabditis elegans


hypothetical protein 1207-1;
Method: conceptual
translation supplied by
authors


244


41 


915 


Y02591 


Homo sapiens 


A human progesterone receptor 
complex p23-like protein. 


843 


99 


915 


AE000984 


Archaeoglobus fulgidus


dinitrogenase reductase
activating glycohydrolase
(draG)


171 


26 


918


M23159 


Cricetus 
cricetus 


DHFR-coamplified protein


163 


30 


919


L12018 


Caenorhabditis elegans


putative 


1232 


41


920


AF102177 


Homo sapiens 


tumor antigen SLP-8p


1 "5 c t\ 


■ 


921 


AL096712 


Homo sapiens 


dJ744I24.2 (similar to a 
novel human gene mapping to 
Activator) 


1017 


78 
1 42 


922 


AL161495 


Arabidopsis 
thaliana 


putative WD-repeat protein


t> D O 


1 is 


923 


AL161495 


Arabidopsis 
thaliana 


putative WD-repeat protein 


A A~% 


1 

1 m 


924 


U97001 


Caenorhabditis elegans


similar to
Schizosaccharomyces pombe


OUD 




925 


X71978 


Mus musculus


Fif 


1503 


95


926


M92288


Drosophila melanogaster


beta-spectrin


290 


51 


927 


Y27575 


Homo sapiens 


Human secreted protein 
encoded by gene No. 9.


1392 


100 


928 


Y22499 


Homo sapiens 


Human secreted protein 
sequence clone mh703 1 . 


2249 


100 


930


AJ224326 


Homo sapiens 


ribulose-5-phosphate-epimerase


912 


100 


931 


U28991 


Caenorhabditis elegans


coded for by C. elegans cDNA
cm21c7


660


55






932 


AL080065 


Homo sapiens 


hypothetical protein 


210 


25 


933 


G01384 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5965. 


767 


98 


934 


AJ276485 


Homo sapiens 


integral membrane transporter 
protein 


1200 


100 


935 


AL035681 


Homo sapiens 


dJ756G23.3 (novel protein 
similar to drosophila 
transcriptional repressor)


1142 


80 


936 


AB026808 


Mus musculus 


synaptotagmin XI


2142 


95 


937 


AB015345 


Homo sapiens 


HRIHFB2216


2601


99


938 


X65724 


Homo sapiens 


ORF2


498


100 


939 


W89024 


Homo sapiens 


Polypeptide fragment encoded
by gene 156.


1487 


100 


940 


G04047 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 8128.


117


100 


941 


AF094583 


Homo sapiens 


putative HIV-1 infection
related protein


452


100 


942 


AC024200 


Caenorhabditis elegans


contains similarity to
several zinc finger proteins
but not to the zinc finger
domains


350 


69 


943 


AF129756 


Homo sapiens 



G5c


273 


100 


944 


K23765 


Rattus norvegicus


alpha-tropomyosin


133 


96 


945 


AC009917 


Arabidopsis thaliana


Contains similarity to 


583 


47 


946 


AF223468 


Homo sapiens 


AD021 protein 


551 


44 


947 


AF055473 


Homo sapiens 


GAGE-8


273 


51


948 


X75756 


Homo sapiens 


protein kinase C mu 


2019 


68


949 


AF143956






2300 


93 


950 


Y36729 


Homo 


Human PG1 protein sequence.


1861 


99 


951 


W49041 


Homo sapiens


nu^iioui iuw ucnaity xipupruuc j.ti 
hi ridi ncr Dirot*#» i ri riBP- 2 


232 


67 


952 


AB016881 


Arabidopsis thaliana


gene_id:MXC17.7


203


46 


953 


Y01785 


Homo sapiens 


Human ubiquitin-conjugating
enzyme >Y25341 Y25341 01-JUL-1999
12-AUG-1998 Human NCE-2
protein.


365


100


954 


AF145615


Drosophila melanogaster


BCDNA.GH03377 


823 


4 S 


955 


U09410 


Homo sapiens


zinc finger protein ZNF131


2483 


99 


956 


U09410 


Homo sapiens


zinc finger protein ZNF131 


1853


99 


957 


AF195623 


Homo sapiens


cholinephosphotransferase 1
alpha


2126 


99 


958 


X94917 


Drosophila melanogaster


head-elevated expression in 
0.9 kb 


155 


32 


959 


U54807 


Rattus norvegicus


GTP-binding protein


1167 


97 


960 


AF058807 


Bos taurus


GTP-binding protein rah


606 


97 


961 


G03244 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7325. 


471 


100 


962 


AF078850 


Homo sapiens 


steroid dehydrogenase homolog


583


40 


963 


AP001754 


Homo sapiens


transient receptor potential-related
channel 7, a novel
putative Ca2+ channel protein


317 


30 


964 


AL035419


Homo sapiens


dJ1100H13.1 (putative novel
protein) 


1129 


100 


965 


X61381 


Rattus rattus


interferon-induced protein


202 


46 


966 


D38169 


Homo sapiens


inositol 1,4,5-trisphosphate
3-kinase isoenzyme


3278 


100 


967 


AL031432 


Homo 
sapiens 


CU465N24.2.1 (PUTATIVE novel 
protein) (isoform 1) 


893 


100 


968 


U79275 


Homo sapiens 


unknown 


611 


100 


969 


AJ011306 


Homo 
sapiens 


guanine nucleotide exchange
factor (long isoform)




QO 


970 


AF281134 


Homo sapiens 


exosome component Rrp46


1186 


100 


971 


U53336 


Caenorhabditis elegans


weak similarity over a short
region to myosin heavy chain






972 


AC018749 


Leishmania 
major 


L8840.12 


589 


53 


973 


AF188504 


Mus musculus 


LNV 


^ ^ i 


85 


974 


U25801 


Homo sapiens 


Tax1 binding protein


852


98 


975


AF049523 


Homo sapiens


huntingtin-interacting
protein HYPA/FBP11


X J — V 




976 


AF161530 


Homo sapiens 


HSPC182 


1040 


100 


977 


G04020 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8101.


626 


100 


978 


AF164797


Homo sapiens 


ribosomal protein L17 isolog 


908 


100 


979 


U94991 



Xenopus 
laevis 


transcription factor XLMO1


795 


97 


980 


S73775 


Homo sapiens 


calmitine; calsequestrina 


2029 


100


981 


Y94888 


Homo 
sapiens 


Human protein clone HP01462.


2501 


100 


982 


AJ243191 


Homo sapiens 


heat shock protein 


827 


96 


983


X65020 


Bos taurus 


PSST subunit of the NADH: 
ubiquinone oxidoreductase 
complex 


964 


85 


984 


AJ249207 


Rhodococcus sp. AD45


putative racemase 


351 


43 


985 


Z30093 


Homo sapiens 


basic transcription factor 2, 
35 kD subunit 


1576 


99 


986 


AB030835 


Homo sapiens 


contains two glutamine rich
domains, three zinc-finger
domains, and matrin 3
homologous domain 3 (MH3)


4697 


99 


987 


AF227258 


Bos taurus 


RPGR-interacting protein-1


1262 


3 o 


988 


AL022238 


Homo sapiens 


dJ1042K10.2 (supported by 
GENSCAN, FGENES and GENEWISE)


4048 


99 


989 


AL022238 


Homo sapiens 


dJ1042K10.2 (supported by
GENSCAN, FGENES and GENEWISE)


2321 


a o 


990 


AF161426 


Homo sapiens 


HSPC308 


448 




991 


AF161426


Homo sapiens 


HSPC308 


448




992 


AF161426


Homo sapiens 


HSPC308 


453 


92 


993 


AL023859


Schizosaccharomyces pombe


tRNA-splicing endonuclease
subunit


172 




994 


AL049631 


Homo sapiens 


dJ513M9.1 (novel Homeobox
domain protein) 


241 


47 


995 


AC005253


Homo sapiens 


R26445_1


902 


100 


996 


AF265206 


Homo sapiens 


MOG1 isoform A 


974 


100


997 


AJ248285 


Pyrococcus 
abyssi 


sarcosine oxidase, subunit 
beta (soxB) 


195 


28 


998 


AE003641


Drosophila melanogaster


BG:DS00941.3 gene product


Z 1 D 




999


W69343 


Homo sapiens


Secreted protein of clone 
CR930 1. 


1J4U 

• 


Q Q 


1000 


AY007135 


Homo sapiens 


similar to bovine ADP/ATP 
translocase T1 mRNA with
GenBank Accession Number 
M24102.1 


1543 


100 


1001 


Y73381 


Homo sapiens 


HTRM clone 1877278 protein 
sequence.


1668 


100 


1002 


AF208844 


Homo sapiens 


BM-002 


428 


100 


1003 


AE004944


Pseudomonas 
aeruginosa 


hypothetical protein 


134 


35 


1004 


AL031431 


Homo sapiens 


dJ462O23.2 (novel protein)


2058 


100 


1005 


S45367 


Canis familiaris


centractin


1949 


100


1006 


S45367 


Canis familiaris


centractin


1315 


98 


1007


AB022155


Mus musculus


chaperonin containing TCP-1
epsilon subunit


2649


96 


1008


Y76332 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 38. 


1282 


97 


1009 


AB011414 


Homo sapiens 


Kruppel-type zinc finger 
protein 


1671 


58 


1010 


Z68218 


Caenorhabditis elegans


K01H12.1


269


67


1011


AB011414


Homo sapiens


Kruppel-type zinc finger
protein


1671 


58 


1012


71 4 n n n 




RING1


2017 


100 


1013 


G02841


Homo sapiens


Human secreted protein, SEQ
ID NO: 6922.


332 


93 


1014 


AF145659 


Drosophila 


BcDNA.GH10333


1244 


52 


1015




Homo sapiens


protein encoded by gene 65. 


v w — 


67 


1016


vft *} C Q 1 




complex p23-like protein.


772 


97 


1017




Homo sapiens 


Unman DDrvi Ten / rTMOR TOl ami r»r» 
awiu ucvjUBUUc a&y XL' rtw • / *t . 




100 


1018 


X67250 


Rattus 
norvegicus 


n-chimaerin 


1710 


97 


1019 


AF183417 


Homo 
sapiens 


microtubule-associated
proteins 1A/1B light chain 3


631 


100 


1020 


AF164795 


Homo sapiens 


sex-regulated protein janus-a


CIA 


lu u 


1021 


AF1906^5 


Coturnix 
coturnix 


"^^i — ; ■ — ■ — — — 


oJo 


yo 


1022 


AL133363 


Arabidopsis thaliana


putative protein 




^ * 


1023 


AB034912 


Homo sapiens 


WD-repeat like sequence




100


1024 


AY007091 


Homo sapiens 


similar to Homo sapiens 
mammalian inositol
hexakisphosphate kinase 2 

uuUin wil.ii uc 


2243 


100 


1025 


X69910 


Homo sapiens 


P63 protein 


2958 


99 


1026 


Tin 11 ^ 


Homo sapiens 


V r\\jtr y 


1657 


100


1027


t»tja« m "a "3 


Halocynthia roretzi


nir Jl X — X 


1048 


54 


1028 


AB032931 


Homo sapiens 


ubiquitin-conjugating enzyme


1045 


100 


1029 


G01797 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5878.


749 


98 


1030




Homo sapiens


Human secreted protein, SEQ
ID NO: 5878.


749 


98 






Homo sapiens


vacuolar sorting protein
VPS29/PEP11


960 


100 


1032 


AJ222968 


Mus musculus 


L-periaxin 


120 


30 


1033




CZ /*. V\ *I ^/^ca 3* f**r*V*»;a 

romyces 


D"MRS> - NAM7 H<=> 1 -Lease familv 
protein 


685 


31 


1034 


Y41519 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 75. 


1321 


99


1035 


AJ276004


Mus musculus 


Paxneb protein


1709 


77 


1036 


AF025459


Caenorhabditis elegans


H14A12.3 gene product 


190 


30 


1037 


U37251 


Homo sapiens 


Description: KRAB zinc finger 
protein; this is a splicing 
supplied by author 


196 


43 


1038 


W74580 


Homo 
sapiens 


Human membrane protein 
BA0306. 


1921 


97 


1039 


U88173 


Caenorhabditis elegans


weak similarity to
Arabidopsis thaliana
ubiquitin-like protein 8 


331 


80 


1040 


AF250204


Homo sapiens 


blood group carrier molecule 
DOK1 


J. t> -> / 


QQ 


1041 


Y96730 


Homo sapiens


PR0539, a Costal-2 homologue.


162 


22 


1042 


AF140683 


Mus musculus


F-box protein FWD2 


2397 


98 


1043 


AF151023 


Homo sapiens 


HSPC189 


1104 


100 


1044 


AF181631 


Drosophila 
melanogaster 


BCDNA.GH04929 


204 




1045 


Y77385 


Homo sapiens 


Human collectin amino acid
sequence.


1940 


100 


1046 


AJ243972 


Homo sapiens 


6-phosphogluconolactonase 


131/ 


100


1047 


AB035863 


Homo sapiens 


ATP specific succinyl CoA 
synthetase beta subunit 
precursor 


2324 


99


1048 


AL034550


Homo sapiens 


dJ1184F4.2 (novel protein 
similar to nucleolar protein 
4 (NOL4) (NOLP))


981 


92 


1049 


AF163825 


Homo sapiens 


pre-B lymphocyte protein 3 


634 


100 


1050


AF201949 


Homo sapiens 


60S ribosomal protein L30
isolog 


868 


100 


1051 


AF190624 


Mus musculus 


mdgl - 1 


236 


85 


1052 


AE003529


Drosophila 
melanogaster 


CG6151 gene product 


160 


44 


1053 


G01191 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5272. 


646 


98 


1054 


AL162756


Neisseria 
meningitidis 


Glu-tRNA(Gln)
amidotransferase subunit A


682 


44 


1055 


AF181856


Rattus norvegicus


tRNA selenocysteine
associated protein


1525 


99 


1056 


U89649 


Chlamydomonas reinhardtii


Mrl9,000 outer arm dynein 
light chain 


244


34 


1057 


AF159141 


Homo sapiens 


breast cancer metastasis- 
suppressor 1 


663 


53 


1058 


AF230929 


Homo 
sapiens 


keratinocyte annexin-like 
protein pemphaxin


1710 


99 


1059


AJ270952 


Homo sapiens 


putative membrane protein 


1363 


100 


1060


AF224263 


Heterodontus francisci


HoxD8 


742 


83 


1061 


X63417 


Homo sapiens 


IRLB 


1037 


100 


1062 


AL079345 


Streptomyces coelicolor
A3(2)


hypothetical protein


143 


27 


1063 


Y71112 


Homo sapiens 


Human Hydrolase protein-10
(HYDRL-10).


2547 


100 


1064 


AF263614


Homo sapiens 


acetyl-CoA synthetase


3493 


99 


1065 


Y13356 


Homo sapiens 


Amino acid sequence of 
protein PR0221. 


1363 


100


1066 


AC006153 


Homo sapiens 


similar to Aquifex aeolicus
GTP-binding protein; similar
to AE000771 (PID:g2904292)


662




1067 


Y18930 


Sulfolobus solfataricus


hypothetical protein 




Z 7 


1068 


R65969 


Homo 

sapiens T98G 


Glioblastoma-derived
polypeptide.


a an 




1069




Homo sapiens 


Human secreted protein 
fragment 


863 


96 


1070 


AF177476 


Rattus 
norvegicus 


CDK5 activator-binding 
protein 


1995 


86 


1071 


AF245505 


Homo sapiens 


adlican 


3109 


99 


1072 


U92794 


Mus musculus 


alpha glucosidase II, beta 
subunit 


147 


36 


1073 


G03889 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7970. 


698 


98 


1074 


U15779 


Homo sapiens 


p70 


380 


28 


1075 


Y13392 


Homo sapiens


Amino acid sequence of
protein PR0328.


1271


91






1076 


AF161457


Homo sapiens 


HSPC339 


571 


100 


1077 


Y79509 


Homo sapiens


Human carbohydrate-associated
protein CRBAP-5. 


2151 


98 


1078 


AF223466 


Homo sapiens 


HT015 protein 


831 


66 


1079 


AL132965


Arabidopsis 
thaliana 


putative WD-40 repeat-protein


286 


29 


1080 


AB024937 


Homo sapiens 


LUNX 


1284 


100 


1081 


Y14768 


Homo sapiens 


V-ATPase G-subunit like
protein 


579 


100 


1082


AF016416 


Caenorhabditis elegans


F29A7.4 gene product 


141 


31 


1083 


L13291 


Homo sapiens 


ADP-ribosylarginine hydrolase 


802 


45 


1084 


AB041541 


Mus musculus


unnamed protein product 


151 


44 


1085


G01922


Homo sapiens


Human secreted protein, SEQ
ID NO: 6003.


202 


97 


1086


AB030914


Homo sapiens


H-REV107 protein homolog


833 


100 


1087 


AF151638 


Homo sapiens 


phosphatidylcholine transfer
protein


1142 


100 


1088 


Y84432 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated
protein. 


2783 


100 


1089 


Y94867 


Homo 
sapiens 


Human protein clone HP10563.


613 


100 


1090 


AK023982 


Homo sapiens


unnamed protein product 


130 


49 


1091 


AB041586 


Mus musculus


unnamed protein product 


1103 


81 


1092 


Y71277 


Homo sapiens 


Human 6iipoj protein. 


wvw 


100 


1093 


Li34973 


Mus musculus


protein tyrosine phosphatase
like


1131 


95 


1094 


Y66677 


Homo
sapiens


Membrane-bound protein


522 


£6 


1095 


Y87276 


Homo sapiens


Human signal peptide
containing protein HSPP-53
SEQ ID NO: 53.


1029 


99 




X O * ¥i ft) 




Human signal peptide
containing protein HSPP-53
SEQ ID NO: 53.


863 


98 


1097 


AF161455 


Homo sapiens 


HSPC337 


742 


98 


1098 


U80029 


Caenorhabditis
elegans


similar to thioredoxin


242 


39 


1099 


AJ005066 


Homo sapiens 


Sqv-7-like protein 


1321 


99 


1100 


ACT00586S 


Homo sapiens 


Sqv-7-like protein


1118 


99 


1101








891 


99 






Homo sapiens


Sqv-7-like protein


1016 


99 


1103 


AL110244 


Homo sapiens 


hypothetical protein 


299 


31 


1104




Drosophila
melanogaster 




147 


52 


1105


HJuUj x \J x u 




dJ422F24.1 (PUTATIVE novel
protein similar to C. elegans
C02C2.5)


968 


100 


1106 


U28016 


Mus musculus 


parathion hydrolase
(phosphotriesterase)-related
protein 


1624 


87 


1107 


AJ278150 


Homo sapiens 


putative lipid kinase 


2207 


99 




^> u --> / _> -> 


Homo sapiens


Human secreted protein, SEQ
ID NO: 7814.


495 


98 


1109 


AF217287 


Drosophila
melanogaster 


G protein RhoBTB 


834 


54 


1110 


Y28921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


941 


48 


1111 


Y28921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


1331 


51 


1112 


AF176704 


Homo sapiens 


F-box protein FBX9 


2027 


99 


1113 


AF182076 


Homo 
sapiens 


glioma tumor suppressor
candidate region protein 2


2418


100 


1114 


G04039 


Homo sapiens 


Human secreted protein, SEQ


475


96 








ID NO: 8120. 






1115 


AF229439 


Mus musculus


zinc finger protein 289 






1116 


L40357 


Homo sapiens 


thyroid receptor interactor 


509 


100 


1117 


L40357 


Homo sapiens 


thyroid receptor interactor 


A A A 


ft ^ 


1118 


A12155 


Homo sapiens 


Human X5L cDNA.


lb t 3 


t n o 
J.UU 


1119 


AL161542 


Arabidopsis 
thaliana 


isomerase like protein 


607 


53 


1120 


AL023754 


Homo sapiens 


dJ272L16.1 (Rat
Ca2+/Calmodulin dependent
Protein Kinase LIKE protein)


■£ J> •* 1 


QO 
-7 D 


1121 


Y57901


Homo sapiens


Human transmembrane protein
HTMPN-25.


321 


36 


1122 


214122 


Xenopus 
laevis 


XLCL2 


455 


77 


1123


AF225418


Homo sapiens 


lipase 


li>Jl 


y / 


1124 


Y06518 


Homo sapiens 


Zen GTPase interacting 
protein ZIP. 


•-ton 


100


1125 


AL035690


Homo sapiens 


dJ202I21.1 (novel protein)


952 


100 


1126 


AJ000217 


Homo sapiens 


CLIC2


1286 


99 


1127 


AB030505


Mus musculus 


UBE-lc2 


1069 


79 


1128 


Y73375 


Homo sapiens 


HTRM clone 1427838 protein
sequence.


874 


100 


1129 



Y78941 


Homo sapiens 


Cyclophilin-type peptidyl
prolyl cis/trans isomerase
amino acid sequence. 


877 


100 


1130 


AL023553 


Homo sapiens 


dJ347H13.4 (novel protein)


557 


100 


1131 


Y91945 


Homo sapiens 


Human chaperone protein 6
(HCHP-6).


1408 


100 


1132 


Z68197 


Schizosaccharomyces
pombe


putative nuclear pore protein 


596 


39 


1133 


Z68197 


Schizosaccharomyces
pombe


putative nuclear pore protein 


389 


35 


1134 


AF180681 


Homo sapiens 


guanine nucleotide exchange 
factor 


3597 


100 


1135 


AF079765 


Mus musculus


enhancer of polycomb 


264 


41 


1136 


M62419 


Mus musculus 


clathrin-associated protein 


2189 


99 


1137 


AJ006219 


Drosophila 
melanogaster


clathrin-associated protein


1254 


78 


1138 


Y76218 


Homo sapiens 


Human secreted protein 
encoded by gene 95.


440 


98 


1139 


W86104 


Homo 
sapiens 


A Rab protein designated 
HRABS-2.


1065


99


1140 


Y13401 


Homo sapiens 


Amino acid sequence of 
protein PRO339.


3979 


98 


1141 


W85026 


Chimeric - 
Homo sapiens 


Green fluorescent protein-
Zap70 fusion product. 


3309 


100 


1142 


Y13402 


Homo sapiens 


Amino acid sequence of 
protein PRO310.


1694 


99 


1143 


G03875 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7956.


660 


99 


1144 


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide. 


750 


98 


1145


Y12917 


Homo sapiens 


Amino acid sequence of a 

human secreted peptide.


1096 


100 


1146 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXF34))


1233 


100 


1147 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG 
(PROTEIN DXF34))


1233 


100 


1148 


G02548


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6629. 


370


93 


1149 


Y73338


Homo sapiens 


HTRM clone 2019742 protein 
sequence.


1492 


100 


1150 


W74841 


Homo sapiens 


Human secreted protein 
encoded by gene 113 clone 


228 


55 








HEAAR60.






1151 


AF044201 


Rattus 
norvegicus


neural membrane protein 35;
NMP35


1570 


92 


1152 


AF156774 


Homo 
sapiens 


lysophosphatidic acid 
acyltransferase-gamma1


185S 


99 


1153 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


1 Q "7 ~i 




1154 


AF131852


Homo sapiens


Unknown




inn 


1155


Y41705 


Homo 
sapiens 


Human PRO352 protein
sequence. 


1381 


97 


1156 


G04036 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8117. 


607 


99 


1157


AF112444 


Lupinus 
luteus 


L-asparaginase


287 


43 


1158 


AF151848


Homo sapiens


CGI-90 protein


232 


32 


1159 


AJ272267 


Homo sapiens 


choline dehydrogenase


2449 


100 


1160 


AB001773 


Ciona 
savignyi 


PEM-6 


196 


33 


1161 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107
SEQ ID NO: 107. 


746 


83 


1162 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107. 


746 


83 


1163 


AF113534 


Homo sapiens 


HP1-BP74 protein 


2723 


96 


1164 


AF232226 


Danio rerio 


Deddl 


191 


41 


1165 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


1051 


71 


1166 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


945 


75 


1167 


AF187733


Homo sapiens 


syntaphilin 


831 


42 


1168 


AB019435 


Homo sapiens 


phospholipase


951 


55 


1169 


AF064604 


Homo sapiens 


KE03 protein 


324 


33 


1170 


Y01164 


Homo sapiens 


Polypeptide fragment encoded 
by gene 6. 


1191 


100 


1171 


L03188 


Saccharomyces
cerevisiae


putative 



180 


22 


1172 


AF113751 


Mus musculus 


nuclear pore membrane 
glycoprotein POM210 


3941 


81 


1173 


AJ245417 


Homo sapiens 


G5b protein 


794 


100 


1174 


AL022238 


Homo sapiens 


dJ1042K10.3 (novel protein) 


1285 


100 


1175 


041278 


Caenorhabditis
elegans


F33G12.3 gene product 


332 


28


1176 


M35617 


Homo sapiens


T-cell receptor V-alpha-J-
alpha region


284 


83 


1177 


AC012680


Arabidopsis
thaliana


putative protein phosphatase 
2C; 55455-56414 


209 


37 


1178 


G01345 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5426. 


692 


99 


1179 


AL096767 


Homo sapiens 


dJ579N16.3 (novel protein 
similar to worm, Arabidopsis 
and pine proteins) 


1342 


100 


1180 


AF039716 


Caenorhabditis
elegans


similar to ATP synthase B 
chain 


496 


55 


1181 


Y11710 


Homo sapiens 


collagen type XIV 


1048 


97 


1182 


X82240 


Homo 
sapiens] 
>R94974 
R94974 09- 
MAY-1996 27- 
OCT-1994
Human TCL-1
polypeptide.


T cell leukemia/lymphoma 1 


617 


100 






(Homo 
sapiens 








1183 


U42841 


Caenorhabditis
elegans


short region of weak 
similarity to collagen 


161 


33 


1185 


AJ131613 


Homo sapiens 


dicarboxylate carrier protein 


1470 


39 


1186 


L27645 


Danio rerio 


growth-associated protein


130


36 


1187 


Y02738 


Homo sapiens 


Human secreted protein 
encoded by gene 89 clone 
HLHFP03.


636 


100 


1188 


AF217544 


Xenopus 
laevis


ornithine decarboxylase-2


1459 


60 


1189 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a
protein which promotes 
neurite outgrowth) 


182 


33 


1190 


X89602 


Homo sapiens 


rTSbeta 


197


100 


1191 


U32828 


Haemophilus 

influenzae 

Rd 


ribosomal protein S6 
modification protein (rimK) 


268 


31 


1192 


AF154831 


Rattus 
norvegicus 


PV-1


1403 


60 


1193 


Y50926 


Homo sapiens 


Human fetal brain cDNA clone 
vcl6 1 derived protein. 


918 


100 


1194 


AF026530


Rattus 
norvegicus 


stathmin-like-protein splice
variant RB3


1093 


97 


1195 


U35244 


Rattus 
norvegicus 


vacuolar protein sorting 
homolog r-vps33a


2981 


96 


1196 


Y70470 


Homo sapiens 


Human p53 target molecule, 
PRG3 protein. 


1680 


100 


1197 


AF157318 


Homo sapiens 


AD-017 protein


912 


47 


1198 



AF125443 


Caenorhabditis
elegans


contains similarity to S. 
pombe phosphatidyl synthase 
(GB:Z28295) 


460 


39 


1199 


AF201934 


Homo sapiens 


DC12 


1649 


88 


1200 


AL031775 


Homo sapiens 


dJ30M3.3 (novel protein
similar to C. elegans 
Y63D3A.4) 


1902 


100 


1201 


M21103 


Ovis aries 


BIIIB4 high-sulfur keratin 


484 


82 


1202 


Z85986


Homo sapiens 


dJ108Kll.3 (similar to yeast 
suppressor protein SRP40) 


1143 


75 


1203 


U18762 


Rattus 
norvegicus 


retinol dehydrogenase type I 


890 


52 


1204 


U35730 


Mus musculus


jerky


2235


76


1205 


AB002327


Homo sapiens 


KIAA0329 


151 


24 


1206 


AB019233 


Arabidopsis
thaliana


ubiquinone/menaquinone
biosynthesis
methyltransferase-like


762 


56 


1207 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a
protein which promotes 
neurite outgrowth) 


742 


100 


1208 


AF207989 


Homo sapiens 


orphan G-protein coupled 
receptor 


2326 


100 


1209 


Z97630 


Homo sapiens 


dJ466N1.4 (novel protein
similar to ANK3 (ankyrin 3,
node of Ranvier (ankyrin
G)))


181 


44 


1210 


riot c a a 


Mus musculus


3\ Q /t\Vi \j^f%viY\ "i 1 i T\ 


1280 


68 


1211 


Y27700 


Homo sapiens 


Human secreted protein 
encoded by gene No. 12. 


1267 


100 


1212 


AF117814


Mus musculus


odd-skipped related 1 protein


945 


66 


1213 


AF277233 


Naegleria 
fowleri


calcineurin B 


222 


39 


1214 


D14849 


Mus musculus 


meiosis-specific nuclear
structural protein 1 


1950 


77 


1215 


G03022 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7103.


590 


100 


1216 


Z72510 


Caenorhabdit 


similarity to yeast UTR3 


634 


49 






is elegans 


protein (Swiss Prot accession 
yKo / /nij. - .3 comes rrom cms 
gene 






1 JIT 


Z4 


s cerevisiae 




~ 1A 
J. J* 


X X 


X ^- _L O 




thaliana


F9 "1 P 


1 Q Q 

12? 17 


1 "> Q 


1219








1 U^D 


/ 1 


1220 


Z70750 


Caenorhabditis
elegans


similar to vanadate
resistance protein

v.i aii^tticiuijiaiiuuo t-w/iivc^ i. milt 


965 


58 


1221


n.T.i c** pit ^ 


Arabidopsis
thaliana


putative protein


ez c. i 


D 1 


1222 


AF155100 


Homo sapiens 


zinc finger protein NY-REN-21

Oil U X^jCU 


2261 


100 








o j. jr — jjxiiQ.j-iiy regulatory 


■JCC 
J30 


■ nn 

JLUU 


1224 


Y73364 


Homo sapiens


HTRM f-1 nnp 27fi*;<)<)l nrnf p i n 
semipnce 




99 


1225 


AL050170


Homo sapiens 


hypothetical protein 


714 


100 








PAP74 


« O O X 


9 9 

j> j* 


1227 


X04085 


Homo sapiens 


catalase 


2846 


100 


1228 


AJ005620


Mus musculus 


skeletal muscle-specific gene 


1416 


90 


1229 


AF045564 


Rattus 
norvegicus 


development-related protein


1715 


93 


1230 


X97571 


Mus musculus 


HCMV-interacting protein


479 


96 


1231 


L08239 


Homo sapiens 


located at OATL1


2274 


100 


1232 


AF121863 


Homo sapiens 


sorting nexin 14 


1964 


100 


1233 


AF121863 


Homo sapiens 


sorting nexin 14 


1203 


84 


1234 


AC024805 


Caenorhabditis
elegans


contains similarity to 
TR:O04595 


744 


31 


1235 


AC006634 


Caenorhabditis
elegans


contains similarity to
Saccharomyces cerevisiae
probable membrane protein
YLR418C (GB:U20162)


357 


33 


1236 


Y18101


Mus musculus


macrophage actin-associated-
tyrosine-phosphorylated
protein


1559 


87 


1237 


Aj3042o4 d 


Homo sapiens 


iKjrIJL72 


1.2.2. <t 


t An 

IVU 


1238 


AB026264 


Homo sapiens 


IMPACT 


1694 


100 


1239 


AB026264 


Homo sapiens 


IMPACT 


1123 


inn 

J.UU 


1240 


G00429 


Homo sapiens 


Human secreted protein, SEQ 


/i 

324 


- inn 
1UU 


1241 


Y76144 


Homo sapiens 


Human secreted protein 
encoaea Dy gene «i • 


1363 


53 


1242 


AL035602 


Arabidopsis 
thaliana 


putative protein 


499 


28 


1243 


X76483 


Gallus 
gallus 


Yes-associated protein 
(65kDa)


574 


48 


1244 


AF220186 


Homo sapiens 


uncharacterized hypothalamus 
protein niUl<J 


503 


100


1245 


AL021453


Homo sapiens 


dJ821D11.3 (PUTATIVE protein) 


856 


100 


1246 


AJ276003


Homo sapiens


GAR1 protein


1216


100


1247 


Y57910 


Homo sapiens 


Human transmembrane protein 


1369 


98 


1248 


AC004874 


Homo sapiens 


similar to N-acetylgalactosaminyltransferase;
similar to Q07537
(PID:g1171989)


9S7 


100 


1249 


AF199597


Homo
sapiens


A-type potassium channel
modulatory protein 1


1139


100


1250 


Y13148 


Rattus 
norvegicus 


PAG608 


1350 


88 


1251 


M24852 


Rattus 
norvegicus 


neuron-specific protein PEP-19


124 


46 


1252 


AF146738 


Rattus
norvegicus


testis specific protein 


771 


83 


1253 


G02725 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6806.


413 


97 


1254 


W44375 


Homo sapiens 


Human ubiquitin-conjugating
enzyme polypeptide. 




Q Q 


1255 


AC006538 


Homo sapiens 


J. 1 y r> J. 


Oil 
OO 1 


1 P 
/ o 


1256 


AB004316 


Bos taurus 


mitochondrial methionyl-tRNA 
transformylase 


1556 


88


1257 


Z35094 


Homo sapiens 


SURF-2


1354 


97 


1258 


Y13362 


Homo sapiens 


Amino acid sequence of 
protein PRO214.


2383 


100 


1259 


AC006014 


Homo sapiens 


similar to RFP transforming 
protein; similar to P14373
(PID:g132517)


1299 


100 


1260 


AC005099 


Homo sapiens 


match to AI222572 
(NID:g3804775) 


469 


100 


1261 


V00507 


Homo sapiens 


coding sequence of DHFR (1 is 
1st base in codon) (561 is
3rd base in codon)


984 


100 


1262 


X15443 


Rattus sp. 


gamma-glutamyl transpeptidase
(AA 1-568) 


697 


32 


1263 


AF173871 


Mus musculus 


neuronal PAS 3 


977 


94 


1264 


AF178983 


Homo sapiens 


Ras-associated protein Rap1


433 


97 


1265 


Y70473 


Homo sapiens


Human cyclic nucleotide-
associated protein-1 (CNAP-
1).


2785 


99 


1266 


Y41738 


Homo 
sapiens 


Human PRO541 protein
sequence.


1622 


100


1267 


AF061346 


Mus musculus


Edpl protein 


1077 


64 


1268 


U97006 


Caenorhabditis
elegans


C13F10.4 gene product 


154 


23 


1269 


AF233582 


Mus musculus 


GTPase Rab37


942 


95 


1270 


AF195951 


Homo sapiens 


signal recognition particle 
68 


3127 


98 


1271 


AL031177 


Homo sapiens 


dJ889M15.3 (novel protein)


1150 


55 


1272 


AF201933


Homo sapiens 


DC11 


650 


100 


1273 


AF201933 


Homo sapiens 


DC11 


346 


98 


1274 


AL021710 


Arabidopsis
thaliana


putative protein 


348 


43 


1275 


AC004449 


Homo sapiens 


R33683_3


c c c 




1276 


Y86295 


Homo sapiens 


Human secreted protein 

Hli2AG87 , SEQ ID ND:/10. 


1920 


100 


1277 


Y71111 


Homo sapiens 


Human Hydrolase protein-9
(HYDRL-9).


1 ETC 

lb /o 




1278 


S94421 


Homo sapiens


t ceil receptor eca-exon 


n / D 


t no 


1279 


Y66695


Homo
sapiens


Membrane-bound protein
PRO1344.


1 QHQ 


i no 


1280


AF1613 BO 


Homo sapiens 




f / A 


1 OO 

J.VU 


1281 


Y48610 


Homo sapiens 


Human breast tumour- 
associated protein 71.


779 


100 


1282 


AC015446 


Arabidopsis
thaliana


Similar to aigi protein 


406 


35 


1283 


AK024432 


Homo sapiens


FLJ00022 protein


403


35 


1284 


W96153 


Homo sapiens 


Human FADD-interacting
protein (FIP)


1823 


ol 


1285 


A*3u01019 


Homo sapiens 


ring finger protein 


1301 


100 


1286 


AE003823


Drosophila
melanogaster


CG13178 gene product 


195 


29 


1287 


AF178632


Homo sapiens 


FEM-l-like death receptor 
binding protein 


3261 


100 


1288 


AC006033 


Homo 
sapiens 


similar to MLN 64; similar to
138027 (PID:g2135214)


1195 


100 


1289 


AC006033


Homo
sapiens


similar to MLN 64; similar to
138027 (PID:g2135214)


668 


93 


1290 


AB023811 


Homo sapiens 


TU3A 


351 


54 






WO 01/53312 



PCT/US00/34263







1291 


Z73424 


Caenorhabditis
elegans


C44B9.1 


235 


36 


1292 


Y94871 


Homo 
sapiens 


Human protein clone HP02551. 


1222 


100 


1293 


AF130425 


Homo sapiens 


retinoblastoma-associated 
protein RAP140 


489 


29 


1294 


G03856 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7937.


538 


99 



1295 


AF133670 


Mus musculus


ARL-6 interacting protein-2


367 


51 


1296 


AJ249735


Homo sapiens


claudin-6


1142 


100 



1297 


X57560 


Escherichia 
coli 


ncnE nrotein 


535 



100 









Din Cli-LU Lyo tCiliC 1 1^11 ViV(UCI-Lil0 

protein 1 


199*7 




*~ j -j 


U41023 

w ^ 1 U 4. ^ 


Pa pnnrhahri i t" 

JL *J w JL w y v*-* 1^ 


ykl09h8.5 


324 






AO \J ^ *m »j> -3 






1206 

-mL £» V/ U 


10 0 

Iv V 


1301 


X5S989 


Homo sapiens 


eosinophil cationic-related
protein 


737 


99 


1302 


AF007151 


Homo sapiens 


unknown 


1481 


100 


1303 


A529Q4 


Escherichia
coli


open reading rrame iaa i-fab^ 


3 ->1* 


lvU 


13 U4 


U19b77 


Escherichia
coli 


! gainctonace acnyaracase 




?J 


13 0b 


AF26 6508 


Mus musculus


NEXir pro t e i n 


1 4 AQ 

1* Ui* 


17 / 


1306 


Y57901 


Homo sapiens 


Human transmembrane protein 

HTMPN-25.


932 


100 


1307 


U58750 


Caenorhabditis
elegans


similar to the mitochondrial
carrier family


365 


54 


1308


Ar U 4 4 / 74 


Homo sapiens 


breakpoint cluster region
protein 2 


M»oa i 


qq 


1309


ALiU / obl7-A 


Homo sapiens




M>0 / 




1310 


X82693 


Homo sapiens 


E48 antigen


620 


96 


1311


Zgi2oi 


-m. _Mh -mm. -fc W «aib *Im_, m£ m>m 

is elegans 


C4VA4 ♦ 1 


ZD J 




1312 


AF131218


Homo sapiens 


chromosome 16 open reading
frame 5 


±*± 


± u u 


1313


x41 / ©3 


Homo 
sapiens 


Human JVKuyjy procein 
sequence . 


lO JO 


i no 


1314 


Ar"19t>972 


Homo sapiens


oM^4 protein 




i no 

Iwv 


1315


AF053356


Homo sapiens 


insulin receptor substrate
like protein


mCmI o 


9 7 


1316 


Y66695 


Homo 
sapiens 


Membrane-bound protein
PRO1344.


1909 


100 


1317 


AF153127 


Gallus 
gallus 


SAPK interacting protein


2442 


89 


1318 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


1477 


83 


1319 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


1651 


86 


1320 


X56932 


Homo sapiens 


23 kD highly basic protein 




Ton 
lOU 


1321 


AF174605 


Homo 

sapiens] 

>Y83086 

MAR-2000 28- 
AUG-1998 F- 
box protein 
FBP-18. 
[Homo 
sapiens 


F-box protein Fbx25


467


70 


1322 


M61732 


Trypanosoma 
cruzi 


neuraminidase 


214 


24 


1323 


Y17013 


porcine 
endogenous 


pol 


304 


64 
retrovirus








1324 


AL138655


Arabidopsis 
thaliana 


putative protein 


1174 


37 


1325 


AL138655 


Arabidopsis 
thaliana 


putative protein 


946 


35 


1326 


AL133215


Homo sapiens 


bA108L7.2 (novel protein 
similar to rat tricarboxylate 
carrier) 


1322 


99


1327 


AF161541 


Homo sapiens 


HSPC056 


1357 


99 


1328 


Y73346 


Homo sapiens 


HTRM clone 619699 protein 
sequence . 


785 


96 


1329 


L10910 


Homo sapiens 


splicing factor 


912 


82


1330 


AF146568 


Homo sapiens 


Mini protein 


1936 


100 


1331


W87772


Homo sapiens


Human serum glucocorticoid-
regulated kinase (H-SGK2)
polypeptide.


232


39


1332


Y41741


Homo 
sapiens 


Human PRO704 protein 
sequence.


1860 


100 


1333


AF295096


Homo sapiens


zinc-finger protein ZBRK1


411 


91 


1334 


Z82271 


Caenorhabditis
elegans


Similarity to Mouse kinesin-
like protein KIF4 comes from 
this gene 


578 


44 


1335




Methanobacterium
thermoautotrophicum


conserved protein


290


43 


1336 


Y68779 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-11. 


1019 


91 


1337 


AB027003 


Mus musculus


protein phosphatase 


378 


84 


1338 


U64856 


Caenorhabditis
elegans


weak similarity to TPR 
domains 


215


40 


1339 


AE001394 


Plasmodium 
falciparum 


protein of the YMR7 family 


170 


29 


1340 


X76717 


Homo sapiens 


MT-11 protein 


204


89 


1341 


AC011914 


Arabidopsis 
thaliana 


putative mutT protein; 68398-
67881 


289 


45 


1342 


AJ276171


Homo sapiens 


ASPIC 


2122 


100 


1343 


AF187016 


Homo sapiens 


myosin regulatory light chain 
interacting protein MIR 


2303 


99 


1344


AC006963 


Homo sapiens 


similar to Kelch proteins; 
similar to BAA77027 

(PID:g4650844) 


894 


35 


1345 


AF257466


Homo sapiens 


N-acetylneuraminic acid 
phosphate synthase 


1880 


99 


1346 


Y25896 


Homo sapiens 



Human secreted protein 
fragment encoded from gene 
64. 


1148 


100


1347 


AJ272073 


Torpedo 
marmorata 


male sterility protein 2-like
protein 


1664 


58 


1348 


AF161548 


Homo sapiens 


HSPC063 


1018 


98 


1349 


W78128 


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 
HOSBI96.


1117


100 


1351 


G02144 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6225. 


418 


100 


1352 


D90869 


Escherichia 
coli 


similar to 


2047 


100 


1353 


A12029 


Homo sapiens 


MRP- 14 


613 


100 


1354 


AC005328


Homo sapiens


R26660_1, partial CDS


870 


74 


1355 


AC024876 


Caenorhabditis
elegans


contains similarity to
SW:RPB1_CRIGR


829 


61 


1356 


AF077226 


Homo sapiens 


copine III 


1876 


64 


1359 


AF217188


Mus musculus


YIP1B 


801 


63 


1360 


AC074331 


Homo sapiens 


ZNF234


3669 


100 


1361 


AL163279 


Homo sapiens 


homolog to cAMP response 


5035 


99 








element binding and beta 
transducin family proteins 






1362 


Z48475 


Homo sapiens 


glucokinase regulator 


3160 


99 


1363 


Z48475


Homo sapiens


glucokinase regulator


b t> Z 


i# / 


1364 


AF195764 


Homo sapiens


megakaryocyte-enhanced gene
transcript 1 protein; MEGT1 
protein 




Q O 


1365 


AF116609 


Homo sapiens 


PRO0915 


581 


100 


1366 


AF116609 


Homo sapiens 


PRO0915


581 


100 


1367 


AL117352


Homo sapiens 


dJ876B10.3 (novel protein 
similar to C. elegans 
T19B10.6 (Tr:Q22557))


2581 


99 


1368


Y34124 


Homo 
sapiens 


Human potassium channel 
K+Hnovl5. 


1342 


100


1369 


AJ245621 


Homo sapiens 


CTL2 protein 


J / ^ o 




1370 


AF008220 


Bacillus 
subtilis 


YtaG 


429 


45 


1371 


X05562 


Homo sapiens 


alpha-2 chain precursor (AA -
2S to 1018) (3416 is 2nd base 
in codon) 


5908 


99 


1372 


Z98048 


Homo sapiens 


dJ408N23.4 (novel DnaJ domain 
protein) 


1296 


99 


1373 


AF154415 


Homo sapiens 


FLASH 


10253 


100 


1374 


U20286 


Rattus 
norvegicus 


lamina associated polypeptide 
1C 


1567 


69 


1375 


U53445 


Homo sapiens 


DOC1


1645


46


1376 


AL117337 


Homo 
sapiens 


bA393J16.1 (zinc finger 
protein 33a (KOX 31)) 


250 


60 


1377


AC005328 


Homo sapiens 


R26660_l, partial CDS 


1126 


100 


1378 


U35113 


Homo sapiens 


metastasis-associated gene


1823 


69 


1379 


L1S313 


Caenorhabditis
elegans


putative 


858 


58 


1380 


Y25756 


Homo sapiens 


Human secreted protein 
encoded from gene 46.


1508 


100 


1381 


AB037360 


Homo sapiens 


ANKHZN 


5734 


95 


1382 


AB037360 


Homo sapiens 


ANKHZN 


959 


97 


1383 


AF237676 


Mus musculus


G beta-like protein GBL


1721 




1384


AF237676 


Mus musculus 


G beta-like protein GBL


1043 


70 


1385 


Y58793 


Homo sapiens 


Human calcium regulatory 
protein CaREG-1.


715 


1LU 


1386


AF212162 


Homo sapiens 


ninein


i r\ ~i c o 
103637 


OQ 


1387 


AL031685 


Homo sapiens 


dJ963K23.2 (novel protein) 


337 


33 


1388 


AC004890 


Homo sapiens 


similar to zinc finger 
proteins; similar to BAA24380 
>W06316 W06316 03-OCT-1996 
27-APR-1995 TRP - 1 protein. 


542 


DC 

oo 


1389 


AF187989 


Homo sapiens 


zinc finger protein ZNF223


2665 


99 


1390


AC035150 


Homo sapiens 


Zinc finger protein ZNF221 






1391 


AF287894


Homo sapiens 


PIST 




97 


1392 


AF282265 


Homo sapiens 


inner centromere protein 
INCENP 


1794 


99 


1393 


X90840 


Homo sapiens 


axonal transporter of 
synaptic vesicles 


4584 


99 


1394


AF076249 


Homo sapiens 


zinc finger protein SBBIZl 




qq 


1395 


G02224 


Homo sapiens 


Human secreted protein, SEQ 
tt\ wr>. cine 


299 


75 


1396 


AC004809 


Arabidopsis 
thaliana 


Similar to 


130 


34 


1398 


AF242519 


Homo sapiens 


zinc finger protein SBZF3 


181 


66 


1399 


AL133396 


Homo 
sapiens 


dJ1068H6.4 (prion protein
like protein doppel) 


962 


100 


1400 


Y48611 


Homo sapiens 


Human breast tumour- 
associated protein 72.


817 


99 


1401 


AC004472 


Homo sapiens 


PI. 11659 5 


280 


54 


1402 


X91489 


Saccharomyces
cerevisiae


putative HMG box 


164 


27 


1403 


Y79222 


Homo 
sapiens 


Human transferase iruN£>r t> 14 . 




100 



1404 


X81058 


Mus musculus


texzbi 


1010 



99 


1405 


AB012084 


Mus musculus 


ITM ^ 


194 


29 


1406 


AB030251 


Homo sapiens 


GTPase activating protein 


J A J J 


99 


1407


AJ010585


Rattus
rattus


PTB-like protein


2684 


99 


1408


X75760


Drosophila
melanogaster


LRR47 


364 


29 


1409 


U76618 


Mus musculus 


N-RAP


ft Oil 


48 


1410 


AC005578 


Homo sapiens 


P20887 1, partial CDS 


835 


63 


1411 


AE000284 


Escherichia 
coli


orf, hypothetical protein.


OCA 
J D U 


i no 


1412 


X01563 


Escherichia 
coli 


L5 (rplE) (aa 1-179)


911


100 


1413 


W78279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33. 


1264 


99 


1414 


AB031051 


Homo sapiens 


organic anion transporter 
OATP-E 


3832 


100 


1415 


M17466 


Homo sapiens 


coagulation factor XII 


3455 


100 


1416 


AF097994 


Homo 
sapiens 


L-kynurenine/alpha-
aminoadipate aminotransferase 


2202 


99 


1417 


AF151077 


Homo sapiens 


HSPC243 


1262 


99 


1418 


Y09945 


Rattus 
norvegicus 


putative integral membrane 
transport protein 


1098 


61 


1419 


U13152 


Mesocricetus 
auratus 


guanine nucleotide-binding
protein beta 5 


2179 


76 


1420 



AL162458


Homo sapiens 


bA465L10.5 (KIAA1176 (novel 
protein, presumed ortholog 
of mouse K-Cl cotransporter 
KCC2) ) 


5696 


100 


1421 


Y99426 


Homo sapiens 


Human PRO1604 (UNQ785) amino 
acid sequence SEQ ID NO: 308.


152 


29 


1422 


Y94923 


Homo sapiens 



Human secreted protein clone 
qsl4_3 protein sequence SEQ 
ID NO: 52. 


4039 


99 


1423 


AF177388 


Homo 
sapiens 


cancer-amplified
transcriptional coactivator 
ASC-2 


10748 




1424 


Y4 8S17 


Homo sapiens 


Human breast tumour- 
associated protein 62. 


1DD1 




1425 


AF208848 


Homo sapiens 


BM-006


1454


89 


1426 


AF208848 


Homo sapiens 


BM-006 


ft 53 

O 3 -J 


79 


1427 


AF112886 


Bos taurus


differentiation enhancing
factor 1


4693


95


1428 


U41387 


Homo sapiens 


Gu protein 




63 


1429 


AF161534 


Homo sapiens 


HSPC049 


2853 


78 


1430 


AF125043 


Mus musculus 


bisphosphate 3'-nucleotidase


^*75 


30 


1431 


Y66718 


Homo
sapiens


Membrane-bound protein
PRO1106.


1886 


100 


1432 


AF193613 


Homo sapiens 


cell recognition molecule
Caspr2 


effi R 

JO o 


100 


1433 


AB044560 


Mus musculus 


Gliacolin 


X J " 


34 


1434 


R99900 


Homo sapiens 


NTII-1 nerve protein, 
facilitates regeneration of
nerve cells. 


70 / 


31 


1435 


AF220530 


Homo sapiens 


myo-inositol 1-phosphate
synthase A1


2904 


100 


1436 


X70944 


Homo sapiens 


PTB-associated splicing
factor 


1261 


72 


1437 


AF271732


Homo sapiens


bridging integrator-3


1282 


100 


1438 


Y30811 


Homo sapiens 


Human secreted protein 
encoded from gene 1.


595 


98 


1439 


AJ293659 


Homo sapiens 


mucolipidin


628 


97 


1440 


AF219138 


Homo sapiens 


GGA3 long isoform


3083 


100 


1441 


AF219138 


Homo sapiens 


GGA3 long isoform 


3346 


100 


1442 


AB039669 


Homo sapiens 


7> • Vi" V "J 

/vLuSJU 


«12#**^ 


t nn 


1443 


AF237711


Drosophila
melanogaster


Diablo


191 


27 


1444 


AJ011896 


Homo sapiens 


JN a X J. DC la p iO C e ill 


A "K. Q 
^ J 27 




1445


X73874


Homo sapiens






9 ft 


1446


Ar 2141 _4 


Homo sapiens 


antigen BCAA 


IQQQ 




1447 


AF003924 


Homo sapiens 


ANC_2H01 


2645 


99 


1448 


AF003136


Caenorhabditis
elegans


contains weak similarity to
an AMP-binding motif


*5 Jl A 1 
^ O ^ J 




1449 


AF155112 


Homo sapiens 


Wi- KiiN -bu antigen 


*1 T Oil 
JL 1 04 


ft Q 


1450 


Y95004 


Homo sapiens 


Human secreted protein 

VCt>4 1, oty iU WU:td . 


CDC 


JLUU 


1451 


AF107203 


Homo sapiens 


ataxin 2-binding protein 


688 


57 


1452 


AF107203 


Homo sapiens 


ataxin 2-binding protein


/ice 
4 56 




1453 


Z38011 


Mus musculus 


DMR-N9 


882


56 


1454 


X90568 


Homo sapiens 


Protein sequence and 
annotation available soon via 
LABEIT@EMBL-Heidelberg.DE


510 


28 


1455 


AL035409 


Homo sapiens 


da564M11.3 (similar to
sialyltransferase)


1356 


100 


1456 


D44480 


Mus musculus 


MATH-2 protein


272 


100 


1458 


AF141326 


Homo sapiens 


RNA helicase HDB/DICE1 


478 


45 


1459 


AF242552 


Gallus 
gallus 


retinovin 


945 


34 


1460 


U11036 


Homo sapiens 


Ibdl 


724 


84 


1461 


AB025258 


Mus musculus 


granuphilin-a


545 


39 


1462 


Y08134


Homo sapiens


acid sphingomyelinase-like
phosphodiesterase 


2428 


99 


1463 


AC004997 


Homo sapiens 


match to ESTs Z43979
(NID:g573097), R19699
(NID:g774333)


869 


98 


1464 


AC004997


Homo sapiens


match to ESTs Z43979
(NID:g573097), R19699
(NID:g774333)


869 


98 


1465 


U32743 


Haemophilus
influenzae
Rd


fucose operon protein (fucU)


315 


50 


1466 


Y09022 


Homo sapiens 


Not56-like protein


2342 


100 


1467


AC003034 


Homo sapiens 


Homolog of rat kidney-
specific (KS) gene 


1072 


99 


1468 


AF071544 


Spinacia 
oleracea


ribulose-1,5-bisphosphate
carboxylase/oxygenase small
subunit N-methyltransferase I


333 


26 


1469 


Y57930 


Homo sapiens 


Human transmembrane protein 
HTMPN-54.


1053 


inn 

xu u 


1470 


AF032666 


Rattus 
norvegicus 


rsec5 


4504 




1471 


Y70467 


Homo sapiens 


Human membrane channel 
protein-17 (MECHP-17).


4 Z>mt 




1472 


AL031033 


Homo sapiens 


C321D2.1 (Ribosomal Large
Subunit Pseudouridine
Synthase protein)


1694


100


1473 


AF177292 


Homo sapiens


genethonin 3


4026


Q Q 


1474 


S45936 


Homo sapiens 


HTS1


1101 


c n 


1475 


Y86241 


Homo sapiens 


Human secreted protein
HOABR60, SEQ ID NO: 156. 


1 H7Q 
x o / ^ 


98 


1476 


AJ010317 


Fugu 
rubripes 


Sand 


1278


68 


1477 


U42831 


Caenorhabditis
elegans


coded for by C. elegans cDNA
yk99b4.3; similar to human 
transforming protein 
(PIR:S22157) 


846 


44 


1478 


X62447 


Homo sapiens 


PR 264


543 


61 


1479 


X82209 


Homo sapiens 


MN1 


7116 


100 


1480 


D10536 


Pan paniscus 


MHC class I A


675 


84


1481 


AL078599


Homo sapiens


dJ951C6.1 (novel protein
similar to C. elegans 
F55A12.9 (Tr:P91086)) 


1274 


65 


1482 


Z98977


Schizosaccharomyces
pombe


putative vacuolar protein 


256 


29 


1483 


AB005662 


Mus musculus


JNK/SAPK-associated protein-1


4968 


92 


1484 


AL050120 


Homo sapiens 


hypothetical protein 


716 


100 


1485 


M27878 


Homo sapiens 


DNA binding protein 


1006 


53 


1486 


Y69161 


Homo sapiens 


Amino acid sequence of a 
partial protein kinase.


575 


99 


1487 


X84156 


Saccharomyce 
s cerevisiae 


ATH1 


341 


29 


1488 


AF038963


Homo sapiens 


RNA helicase 


446 


34 


1489 


U56966 


Caenorhabdit 
is elegans


coded for by C. elegans cDNA 
yk30b3.5; coded for by C. 
elegans cDNA yk30b3.3 


620 


42 


1490 


AE000989 


Archaeoglobu 
s fulgidus 


enoyl-CoA hydratase (fad-4)


533 


46 


1491 


M80633 


Rattus 
norvegicus


adenylyl cyclase type IV


707


95 


1492 


Y73342 


Homo sapiens 


HTRM clone 2709055 protein 
sequence.


3513 


99 


1493 


Y17220 


Homo sapiens 


Human secreted protein (clone 
fj283-11).


452 


37 


1494 


AF133670 


Mus musculus 


ARL-6 interacting protein-2


701 


97 


1495 


Y94897 


Homo 
sapiens 


Human protein clone HP10574.


1371 


100 


1496 


AL049699 


Homo sapiens 


dJ747H23.2 (novel protein) 


1550 


100 


1497


AF037447 


Homo sapiens 


ribosomal S6 protein kinase 


2427 


100 


1498 


AL445067 


Thermoplasma 
acidophilum 


putative target YPL207w of
the HAP2 transcriptional 
complex related protein 


269 


35 


1499


AB039947


Homo sapiens


X11L-binding protein 51


227 


36 


1500


AJ277750


Homo sapiens


UBASH3A protein


3509


100 


1501 


AL050333 


Homo 
sapiens 


dJ93K22.1 (novel protein 
(contains DKFZP564B116) ) 


2439 


100 


1502 


AF179895


Homo sapiens 


TALE homeobox protein Meis2h 


1140 


100 


1503 


AF178948 


Homo sapiens 


TALE homeobox protein Meis2a 


1177 


100


1504 


Y53005


Homo sapiens


Human secreted protein clone
pn749_8 protein sequence SEQ
ID NO:16.


1442 


99 


1505 


X82494 


Homo sapiens 


fibulin-2


3580 


99 


1506


X98296 


Homo sapiens 


ubiquitin hydrolase 


783 


42 


1507 


AL034548


Homo sapiens 


dJ1103G7.6 (novel protein) 


1098 


100 


1508


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21.


1736 


100 


1509 


AF220182 


Homo sapiens 


uncharacterized hypothalamus 
protein HT008 


1181 


98 


1510 


U64601 


Caenorhabdit 
is elegans 


Gene probably begins in the 
next cosmid


415 


58 


1511 


AL356192 


Neurospora 
crassa 


related to mdmi protein 


196 


29 


1512 


D17629 


Homo 
sapiens 


N-acetylgalactosamine 6- 
sulfate sulfatase (GALNS) 


1829 


100 


1513


AF168717 


Homo sapiens 


x 009 protein 


694 


99 


1514 


AJ243531 


Homo sapiens 


nM15 protein 


735 


100 


1515 


AC003672 


Arabidopsis 
thaliana


putative C3HC4-type RING zinc 
finger protein 


407 


30 


1516 


AF115435 


Rattus 
norvegicus 


syntaxin 17 


1374 


90 


1517 


AF003140


Caenorhabdit 
is elegans 


C44E4.5 gene product 


274 


31 


1518 


AB002584 


Rattus 
norvegicus 


beta-alanine-pyruvate
aminotransferase 


2238 


82 


1519 


AL121764 


Schizosaccharomyces
pombe


yeast atp12 protein precursor
homolog


270


30






1520 


AF255910


Homo 
sapiens 


vascular endothelial 
junction-associated molecule 


547 


100 


1521


D31764 


Homo sapiens 


KIAA0064 


170 


27 


1522 


Y66634 


Homo 
sapiens 


Membrane-bound protein
PRO190. 


985 


100 


1523 


Y94450 


Homo sapiens 


Human inflammation associated 
protein 


250 


43 


1524 


AC000107


Arabidopsis 
thaliana 


F17F8.22 


277 


37 


1525 


AF109377 


Mus musculus 


ldlBp 


1277 


83 


1526


AL031427 


Homo sapiens 


dJ167A19.4 (novel protein) 


1432 


99 


1527 


Y08135 


Mus musculus 


acid sphingomyelinase-like
phosphodiesterase


1496 


79 


1528


AK024423


Homo sapiens


FLJ00012 protein


611 


100 


1529 


AF154502


Homo sapiens


quiescent cell proline
dipeptidase


679


100


1530


AF205598


Homo sapiens


transposase-like protein


1368 


100 


1531 


AF251039 


Homo sapiens 


putative zinc finger protein 


1420 


50


-I 3 J .& 


n / sous 




encoded by gene 77 clone 

HOEAS24 




• 


X -? -> J 






l?j»r» -fiTP V"i i rtr\ ^ net riTnt p{ n i 
RanBPS 


mJ § \J r 


99 

■m* mf 


1534 


AC007190


Arabidopsis
thaliana


F23N19.9


374


37 


1535 


AB027564 


Homo sapiens 


DINB1


4482 


100 






Homo sapiens


Human secreted protein


1 "7*7 
Oil 


o / 


1537 


Y50907 


Homo sapiens 


Human fetal brain cDNA clone 
vb3_l derived protein. 


3593 


99 


1538


AF017368


Mus musculus


faciogenital dysplasia
protein


-L / / 


•a / 


1539




Homo sapiens


sphingosine kinase






1540




Homo sapiens




•CV J O 


1 no 


1541 


AF000195 


Caenorhabditis
elegans


Contains similarity to Pfam 

LlvJUictill . rr JUi.o J \ rrii f e 

C/T»>~p»— 0 H F-valuesl Qo. OS 

WW Jto C — ^ W a O f i-> VOX — «fc • ^ W W hJ ^ 

N=1


379 


42 




Y71159





interacting protein,
myomegalin. 


9415 


99 


1543 


X76092 


Homo sapiens 


DNA binding protein RFX3 


3327 


100 


1544 


AB015330 


Homo sapiens 


HRIHFB2007 


631 


50 


1545


AF198487 


Homo sapiens 


transcription factor LBP-lb 


2822 


100 


1546 


AF016417 


Caenorhabdit
is elegans 


Similar to BZIP transcription 
factor 


518 


42 


1547 


X55885 


Homo sapiens


KDEL receptor


1106 


100 


1548


AB035495


Carassius
auratus


ubiquitin-activating enzyme
E1


836 


42 


1549


AL021707


Homo sapiens


dJ508I15.4 (KIAA0668)


3688 


100 


1550 


AJ223978 


Bacillus 
subtilis


YvqK protein 


292 


42 


1551 


AF145615 


Drosophila 
melanogaster


BCDNA.GH03377


822 


44 


1552 


AL157734 


Schizosaccharomyces
pombe


putative mannosyltransferase
involved in N-glycosylation 


435 


37 


1553 


AF079527


Mus musculus 


IER5 


691 


63 


1554 


AB026291 


Rattus 
norvegicus 


acetoacetyl-CoA synthetase 


1099 


88 


1555 


Y44722 


Homo sapiens 


Human immune system molecule, 
XSNO-3. 


1780 


99 


1556 


AF116553 


Drosophila 
melanogaster 


antennal-specific short-chain
dehydrogenase/reductase 


277 


32 


1557 


Y71056 


Homo sapiens 


Human membrane transport
protein, MTRP-1.


1975


99






1558 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1. 


1975 


99 


1559 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1.


1894 


97 


1560 


AF092050 


Mus musculus


beta-1,3-N-
acetylglucosaminyltransferase


262 


44 


1561


AL109827


Homo sapiens


dJ309K20.2 (acrosomal protein
ACR55 (similar to rat sperm
antigen 4 (SPAG4)))


1607 


97 


1562 


AJ131890


Homo sapiens 


DNA polymerase lambda 


3002 


100 


1563 


AL035424 


Homo sapiens 


dA22D12.1 (novel protein 
similar to Drosophila Kelch 
proteins) 


3015 


100 


1564 


AC002400 


Homo sapiens 


Gene product with similarity 
to Ubiquitin binding enzyme 


2790 


100 


1565 


AC005306 


Homo sapiens 


R27216_1


919 


82 


1566 


AF000195 


Caenorhabdit
is elegans


Contains similarity to Pfam
domain: PF00169 (PH),
Score=20.6, E-value=1.9e-05,
N=1


550 


45 


1567 


AB033281 


Homo 
sapiens 


F-box and WD-repeats protein
beta-TRCP2 isoform C 


2879 


100 


1568 


D49473


Mus musculus


truncated form of Sox17


1047 


78 


1569 


AK025270 


Homo sapiens 


unnamed protein product 


210 


91 


1570 


X75756 


Homo sapiens 


protein kinase C mu 


4797 


99 


1571 


AF145713 


Homo sapiens 


SCHIP-1 


2388 


100 


1572 


AE003831 


Drosophila 
melanogaster


CG18445 gene product 


180 


31 


1573 


AF074603 


Streptomyces
griseus
subsp.
griseus 


NonF 


205 


38 


1574 


U28993 


Caenorhabdit
is elegans 


F22D3.3 gene product 


144 


27 


1575 


AF129507 


Homo sapiens 


transcription factor ICBP90 


287 


68 


1576 


X64878 


Homo sapiens 


oxytocin receptor 


2002 


100 


1577 


AF237711 


Drosophila 
melanogaster 


Diablo 


421 


54 


1578 


G00975 


Homo sapiens


Human secreted protein, SEQ 
ID NO: 5056. 


480 


100 


1579 


AF248744


Cryptosporidium
parvum


thrombospondin-related
adhesive protein 


123 


33 


1580 


AL121782 


Homo sapiens 


dJ585I14.2 (novel protein 
(translation of cDNA 
Em:AK000219) ) 


663 


100 


1581 


AF041853


Homo sapiens


kinesin family member protein
KIF3A 


345 


33 


1582 


AF025441 


Homo sapiens 


Opa-interacting protein OIP5


1198


100 


1583 


AE001803 


Thermotoga 
maritima 


glycerate kinase, putative 


3 45? 




1584 


AF252283 


Homo sapiens 


Kelch-like 1 protein


39 / J 




1585 


AF169675 


Homo 
sapiens 


leucine-rich repeat
transmembrane protein FLRT1


~i A Q A 

3 4 94 


O Q 


1586 


AF118274 


Homo sapiens 


DNb-5 


2628 


97 


1587


V7 Q A A A 

A / y44U 


Homo sapiens




3167 


99 


1588 


X99802 


Homo sapiens 


ZYG homologue 


3966 


99 


1589 


AF169803 


Homo sapiens 


flavohemoprotein b5+b5R


2563 


100 


1590 


Y29861 


Homo sapiens 


Human secreted protein clone 
cb98 4. 


181 


47 


1591 


Z25535 


Homo sapiens 


nuclear pore complex protein 
hnup153


7567 


99 


1592 


X13293 


Homo sapiens 


B-myb protein (AA 1-700)


3678 


99 


1593 


M74027 


Homo sapiens 


mucin 


242 


27 


1594 


AL139314 


Schizosaccharomyces
pombe


hypothetical protein


235


54



WO 01/53312 PCT/US00/34263



TABLE 2



SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY








1595 


W78324 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene oi.


1318 


98 


1596 


■V q /i one 


Homo sapiens


xiuntan scCiclcQ proucin cjiouc 
TD NO • lfl 


OTIC 

« «o o 


98 


1597 


AF174605 


Homo sapiens 


F-box protein Fbx25


1408 


99 


t c a a 




rioroo 




9676 


98 


1599 


X73114 


Homo sapiens 


slow MyBP-C 


5568 


95 






Homo sapiens






inn 
^ w 


1601 


Y00876 


Homo 
sapiens 


Human LAPH-1 protein 
sequence . 


1149 


98 


1602 


AJ223351


Homo sapiens


HIRA-interacting protein 3




qo 


1603 


AJ222801 


Homo sapiens 


neutral sphingomyelinase






1604 


AJ222801 


Homo sapiens 


neutral sphingomyelinase 


1601 


99 


1605 


AF185576 


Mus musculus


POZ/zinc finger transcription 
factor ODA-8 


3435




1606 


AF093744 


Homo sapiens 


unknown 


131 


100 


1607 


A12142 


synthetic 
construct 


IFN-pseudo-omega 2


800


98 


1608 


Y57949 


Homo sapiens 


Human transmembrane protein 
HTMPN-73. 


i □ ^ a 
loot) 


i i n A 
JL UO 


1609 


AF151044 


Homo sapiens 


HSPC210 


681 


97 


1610 


X15218 


Homo sapiens 


ski protein (AA 1-728)


3765


100 


1611 


Y08200 


Homo sapiens 


rab geranylgeranyl 
transferase 


2976 


100 


1612 


AF220560 


Homo sapiens 


B/K protein 


2486 


99 


1613 


AC004481 


Arabidopsis 
thaliana 


nodulin-like protein 


371 


26 


1614 


Y09501 


Homo sapiens 


NADH-cytochrome-b5 reductase


1607 


100 


1615 


Y15521 


Homo sapiens 


start position 1 


3150 


97 


1616 


AJ010750 


Rattus 
norvegicus 


Castration induced prostatic 
apoptosis related protein-1,
(CIPAR-1) 


890 


6^ 


1617 


X58079


Homo sapiens 


S100 alpha protein 


481 


100 


1618 


Y66678 


Homo 
sapiens 


Membrane-bound protein
PRO1009.


967 


100 


1619 


AJ242973 


Homo sapiens 


peptide methionine sulfoxide 
reductase 


929 


100 


1620 


AF150733 


Homo sapiens 


AD-014 protein


*i q a 
zoo 


•t nn - 
IUU 


1621 


AJ007509 


Homo sapiens 


E1B-55kDa-associated protein


4646 


98 


1622 


X64177 


Homo sapiens 


metallothionein 


380 


100 


1623 


AE001045 


Archaeoglobu 
s fulgidus 


A. fulgidus predicted coding 
region AF0859 


240 


38


1624 


AL355013


Schizosaccharomyces
pombe


mitochondrial carrier protein


403




1625 


Y66746 


Homo 
sapiens 


Membrane-bound protein
PRO1198.


llo*t 


inn 


1626 


D90053 


Sus scrofa 


destrin


ft S"^ 
O O J 


100 


1627 


Y35954 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
203 . 


/ 9b 


inn 

■X. w w 


1628


AL031775 


Homo sapiens 


dJ30M3.2 (novel protein) 


470 


100 




flr J. 


Mus musculus


unknown


286 


68 


1630 


AF017096 


Drosophila 
melanogaster 


similar to C. elegans 
R10H10.6 and S. cerevisiae 
YD8419.03C 


493 


61 


1631 


X03077 


Homo sapiens 


lactate dehydrogenase-A


1704 


100 


1632 


AF151084 


Homo sapiens 


HSPC250 


763 


100 


1633 


AJ001874 


Homo sapiens 


orf 


255 


97 


1634 


AC012187 


Arabidopsis 
thaliana 


Contains weak similarity to 
GATA-6 DNA-binding protein
gb|H36135, gb|Z26200 come 
from this gene. 


143 


38


1635 


AF026246 


Homo sapiens 


HERV-E integrase


411 


90 


1636 


Y50943


Homo sapiens


Human adult brain cDNA clone 
ve8_l derived protein. 


1126 


95 


1637 


AF134593 


Homo sapiens 


L-pipecolic acid oxidase 


2068 


99 


1638 


AJ238247 


Mus musculus 


putative phosphatase subunit


1948 


96 


1639 


Y94942 


Homo sapiens 


Human secreted protein clone 
yk25l_l protein sequence SEQ 
ID NO:90. 


1320 


100 


1640 


AF235030 


Homo sapiens 


BM88 antigen 


766 


99 


1641 


AF233288 


Drosophila
melanogaster


WDS 


358 


26 


1642 


M19351 


Mus musculus 


immunoglobulin heavy chain 
binding protein 


145 


34 


1643 


Y70452 


Homo sapiens 


Human membrane channel 
protein-2 (MECHP-2) . 


1352 


100 


1644 


AF176520 


Mus musculus 


WD repeat-containing F-box
protein FBW5 


2676 


88 


1645 


W67816 


Homo sapiens 


Human secreted protein 
encoded by gene 10 clone 
HCEMU42 . 


1156 


100 


1646 


X67155 


Homo sapiens 


mitotic kinase-like protein-1


4456 


99 


1647 


M63180 


Homo sapiens 


threonyl-tRNA synthetase 


1040 


61 


1648


Y87342 


Homo sapiens 


Human signal peptide 
containing protein HSPP-119 
SEQ ID NO: 119. 


1566 


93 


1649 


R95332 


Homo sapiens 


Tumor necrosis factor 
receptor 1 death domain 
ligand (clone 3TW) . 


4137 


100 


1650


AC007136 


Homo sapiens 


Putative map kinase 
interacting kinase 


856 


99 


1651 


AB015346 


Homo sapiens 


Eps15R


4464 


99 


1652 


AL161576 


Arabidopsis 
thaliana 


putative protein 


1341 


48 


1653 


AC005313 


Arabidopsis 
thaliana 


putative calmodulin 


288 


28 


1654 


AL031428


Homo sapiens 


dJ184J9.1 (KIAA0601 protein) 


3526 


100 


1655 


AL031428 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein)


3526 


100 


1656 


AB017910 


Dictyosteliu 
m discoideum 


myoM 


297 


32 


1657 


Y28919 


Homo 
sapiens 


Human regulatory protein 
HRGP-S. 


2251 


99 


1658 


AF056191 


Homo sapiens 


TPA inducible protein 


2744 


98 


1659 


U76846 


Arabidopsis 
thaliana 


ubiquitin-specific protease


137 


35 


1660 


AL078627 


Schizosaccharomyces
pombe


actin-like protein; (2 actin
domains) 


320 


34 


1662 


X52022 


Homo sapiens 


collagen type VI, alpha 3 
chain 


16274 


99 


1663 


AF300648 


Homo 
sapiens 


guanine nucleotide binding 
protein beta subunit 4 


1811


100 


1664 


AF214736


Homo sapiens 


EH domain containing protein 
2 


2774 


100 


1665 


Z48613 


Saccharomyce 
s cerevisiae 


unknown 


138 


26 


1666 


AF177385 


Homo 
sapiens 


cytochrome c oxidase assembly 
protein isoform 2 


1395 


99 


1667 


AC007842 


Homo sapiens 


BC331191_1 


1581 


47 


1668 


S67513 


Borna
disease
virus BDV,
WT-1, Halle
Bl/91, horse
brain, field
isolate.
Peptide, 370
aa


p40


397


43








1669 


Z99753 


Schizosaccharomyces
pombe


putative NOL1-NOP2-sun family
nucleolar protein 


569 


47 


1670 


G03130 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7211. 


427 


97 


1671 


M96625 


Gallus 
gallus 


cardiac muscle tensin 


1185 


54 


1672 


AF174482 


Homo sapiens 


polycomb 3 


2005 


99 


1673 


Y51846


Homo sapiens


Human 18.1 homolog protein
fragment.


233 


29 


1674 


AF255334 


Homo sapiens 


EXP35 


152 


29 


1675 


Y94367 


Homo 
sapiens 


Human protein clone HP10563. 


109 


30 


1676 


Y25712 


Homo sapiens 


Human secreted protein 
encoded from gene 2. 


3043 


99 


1677 


Y25712 


Homo sapiens 


Human secreted protein 
encoded from gene 2.


1580 


91 


1678 


AF163151 


Homo sapiens 


dentin sialophosphoprotein 
precursor 


170 


17 


1679 


AF163151 


Homo sapiens 


dentin sialophosphoprotein
precursor 


170 


17 


1680 


AK024453 


Homo sapiens 


FLJ00045 protein


1349


100


1681 


AF019236 


Dictyosteliu
m discoideum


TxoD


613


34 


1682 


AJ243459 


Leishmania
major


proteophosphoglycan


153 


26 


1683 


Z69369 


Schizosaccharomyces
pombe


putative GTP-binding protein


560 


46 


1684 


X94910 


Homo sapiens 


ERp28 


1334 


100 


1685 


AF286475 


Takifugu 
rubripes 


retinitis pigmentosa GTPase 
regulator-like protein


196 


19 


1686


AF191298 


Homo sapiens 


vacuolar sorting protein 35 


4087 


100 


1687 


AJ275986 


Homo sapiens 


transcription factor 


2958 


100 


1688 


AJ275986 


Homo sapiens 


transcription factor 


1886 


88 


1689 


X07311 


Drosophila 
melanogaster 


heat shock protein 


138 


43 


1690 


AF240463 


Rattus 
norvegicus 


LIS1-interacting protein
NUDE1 


1383 


83 


1691 


AJ272078 


Homo sapiens 


APOBEC-1 stimulating protein 


1256 


68 


1692 


AJ272079 


Homo sapiens 


APOBEC-1 stimulating protein


1336 


60 


1693 


AF177942


Xenopus 
laevis 


katanin p60 


1664 


66 


1694 


AF263539


Homo sapiens


arginine N-methyltransferase


1774 


100 


1695 


AF222689


Homo
sapiens


protein arginine N-
methyltransferase 1-variant 2


1182 


81 


1696 


AK000193 


Homo sapiens 


unnamed protein product 


1060 


100 


1697 


AB041035 


Homo sapiens 


kidney superoxide-producing
NADPH oxidase 


3122 


100 


1698 


AB041035


Homo sapiens 


kidney superoxide-producing 
NADPH oxidase 


2181 


100 


1699 


AF025772 


Homo sapiens 


C2H2 zinc finger protein 


488


54 


1700 


Y44676 


Homo sapiens 


Human ARF-Related Protein-1
(HARP-1).


938


97 


1701 


AK022407 


Homo sapiens 


unnamed protein product 


315 


98 


1702 


AB024574 


Homo sapiens 


GTP-binding like protein 2 


1172 


100 


1703 


AF055078 


Homo sapiens 


zinc finger protein 42 


421 


52 


1704 


AF198092 


Mus musculus


RP42 


1057 


77 


1705 


AE003573 


Drosophila 
melanogaster 


CG12474 gene product 


161 


33 


1706 


AB036345 


Drosophila 
melanogaster 


aquaporin 


164 


24 


1707 


Y55927 


Homo sapiens 


Human STLK2 protein. 


2146 


100 


1708 


U27121 


Danio rerio 


G12 


212 


47 


1709 


AL391710 


Arabidopsis
thaliana


putative protein


505


50








1710 


B01311 


Homo sapiens 


Human PRO241 polypeptide.


1649 


97 


1711 


U40750 


Mus musculus


formin binding protein 30 


4561 


85 


1712 


AJ011118 


Mus musculus


skeletal muscle and cardiac 
protein 


1490 


89 


1713 


AF255303


Homo 
sapiens 


membrane-associated nucleic 
acid binding protein 


4416 


95 


1714 


AF255303 


Homo 
sapiens 


membrane-associated nucleic
acid binding protein 


2960 


100 


1715 


U08227 


Rattus 
norvegicus 


Ras-related protein


511 


51 


1716 


AF168795 


Rattus 
norvegicus 


schlafen-4


1129 


44 


1717 


AF196304 


Homo sapiens 


SUMO-1-specific protease


5804 


99 


1718 


AL355737


Homo sapiens 


HMG20A 


1782 


100 


1719 


AB029333 


Halocynthia 
roretzi 


HrPET-1


1069 


46 


1720 


AF071317 


Mus musculus


COP9 complex subunit 7b


1297 


97 


1721 


AJ272215 


Homo sapiens 


HEYL protein


1681 


99 


1722 


G01982 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6063.


718 


100 


1723 


AL032643 


Caenorhabdit
is elegans


similar to Uncharacterized
protein family UPF0034, 


825 


41 


1724 


G01972 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6053.


586 


92 


1725 


Y94441 


Homo 
sapiens 


Human Adipose Specific 
Protein 1. 


1231 


100


1726 


AF255443


Homo sapiens


CGI-201 protein


4397 


99 


1727 


AF183426 


Homo sapiens 


HT004 protein 


1810 


99 


1728 


D10884 


Bos taurus 


neurocalcin


1002 


99 


1729 


Z18529 


Gallus 
gallus 


tensin 


1411 


84 


1730 


Z73423 


Caenorhabdit
is elegans


cDNA EST EMBL:Z14908 comes
from this gene-cDNA EST this
gene 


233 


41 


1732 


AF050891


Homo sapiens 


PRO0105 


470 


30 


1733 


AJ277724 


Homo sapiens


histone deacetylase 8 


2015 


100


1734 


G04050 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8131. 


503 


95 


1735 


D45913 


Mus musculus


leucine-rich-repeat protein


3531 


94 


1736 


AF096709 


Drosophila 
virilis 


failed axon connections 
protein 


276 


32 


1737 


AF195120 


Homo sapiens 


dynactin p62 subunit 


2417 


99


1738 


UL5314 


Caenorhabdit
is elegans 


contains similarity to Pfam 
family PF01772 N=l 


206 


37 


1739 


X54618 


Listeria
monocytogenes


phosphatidylinositol specific
phospholipase C


134 


27 


1740 


AL031658 


Homo sapiens 


dJ310O13.4 (novel protein
similar to predicted C.
elegans and C. intestinalis
proteins) 


123 


31 


1741 


Y35924 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
173.


1013 


99 


1742 


AC013354 


Arabidopsis 
thaliana 


F15H18.15 


202 


32 


1743 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08. 


1932 


59 


1744 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08. 


1854 


61 


1745 


AF221098 


Homo 
sapiens 


Ral guanine nucleotide 
exchange factor RalGPSIA 


1224 


70 


1746 


Y99372 


Homo sapiens 


Human PRO1430 (UNQ736) amino 
acid sequence SEQ ID NO: 116. 


1332 


99 


1747 


Y94294 


Homo sapiens 


Human coenzyme A-utilising
enzyme CoAEN-2.


842


100






1748 


AK024436


Homo sapiens


FLJ00026 protein


1619


100


1749 


AE000877 


Methanobacterium
thermoautotrophicum


conserved protein


231


"?6 1 


1750


AF101361


Drosophila
melanogaster


Abnormal X segregation


193


33


1751


Y15067


Homo sapiens


ZNF232


889


100 


1752 


AF251038 


Homo sapiens


GAP-like protein


822


100


1753


AC003093 


Homo sapiens


OXYSTEROL-BINDING PROTEIN;
45% similarity to P22059
(PID:g129308)


352


1 


1754 


X69089


Homo sapiens


165kD protein


5703


99


1755 


AL049755


Homo sapiens


dJ622L5.3 (novel protein)


1039


100


1756 


AL031393 


Homo sapiens


CU733D15.1 (Zinc-finger
protein)


2765


100


1757 


AB040672 


Homo sapiens


UDP-GalNAc: polypeptide N-
acetylgalactosaminyltransferase


2020


99


1758 


AL022238 


Homo sapiens


dJ1042K10.4 (novel protein)


776


43 


1759 


AF117653 


Homo sapiens 






Cil I 
1 


1760


Y12065


Homo sapiens


hNop56


2959


99


1761 


AL049712


Homo sapiens


uuoodlj . ^ mucicuiar proccin 
hNop56) 




O O i 

9S» | 


1762 


AC002394


Homo
sapiens


product with similarity
to dynein beta subunit




51


1763 


AF169017 


Homo sapiens 


formiminotransferase
cyclodeaminase


877


100 


1764 


w J X X 


UAfnn cdt%^ Airs a 


nutnan rornu.rai no transferase 

t^y rr~* ~] /vfd^in c a. 1 & t~ ^ \ n«*A t* a ^ m 

\>yuxvacauu.uase 1 1 cca; protisin r 


596 


100


1765 


AB013365 


Bacillus 
halodurans


*YlaF 1 




^ ill I 

j4 1 


1766


Y38421


Homo sapiens


encoded by gene No 36


J- 3 


"7 1 1 
1 


1767 


AC009176 


Arabidopsis
thaliana


putative ribulose-1,5-
bisphosphate
carboxylase/oxygenase small
subunit N-methyltransferase I


21^ 


27


1768 


AK000647


Homo sapiens


unnamed protein product


737


99


1769 


AJ238982


Homo sapiens


VNN3 protein


2665 


99 


1770 


U73522


Homo sapiens


AMSH


1214 


5£ 


1771 


U89435


Mus musculus


unknown


829


66


1772 


S70011


Rattus sp.


tricarboxylate carrier


1604


95


1773 


AL035086


Homo sapiens


dJ44A20.2 (novel protein)


2036


100


1774 


Y99426


Homo sapiens


Human PRO1604 (UNQ785) amino
acid sequence SEQ ID NO: 308.


1057


99


1775 


AF110330


Homo sapiens


glutaminase


3146


100


1776 


AJ269529


Homo sapiens


glycerol 3-phosphate permease


2787


100


1777 


Z81579


Caenorhabdit
is elegans


cDNA EST yk76fl.5 comes from
this gene


232


31


1778 


AY007239


Homo sapiens


monooxygenase X


1875


99


1779 


AL109608


Schizosaccharomyces
pombe


oxysterol-binding protein
family 


644 


38 


1780 


AF254260 


Homo sapiens 


tuftelin 1 


1729 


100


1781 


L07924 


Mus musculus 


guanine nucleotide 
dissociation stimulator 


247 


50 


1782 


AF295773 


Homo 
sapiens 


ral guanine nucleotide 
dissociation stimulator 


142 


49^ 


1783 


AK024475 


Homo sapiens 


FLJ00068 protein


4333 


100 


1784 


AK024475 


Homo sapiens


FLJ00068 protein 


3996 


93 


1785 


G03933 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8014.


570 


100 


1786 


S82637 


Homo sapiens 


Ig lambda-like gene/beta-
glucuronidase exon 11 homolog


247


100







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TABLE 3 



SEQ ID NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 


2 


BL00240


Receptor tyrosine kinase
class III proteins.


BL00240B 24.70 8.250e-12 157-181


3 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE


PR00109D 17.04 8.085e- 
13 358-381 


4 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 9.400e-
10 1129-1146 BL00028
16.07 1.257e-09 820- 
837 


5 


BL00023 


Type II fibronectin 
collagen-binding domain
proteins.


BL00023 24.31 8.920e-
33 413-450 BL00023
24.31 4.545e-27 353- 
390 


6 


BL00023 


Type II fibronectin 
collagen-binding domain
proteins.


BL00023 24.31 8.920e-
33 413-450 BL00023
24.31 4.545e-27 353-
390 


7 


BL00023 


Type II fibronectin 
collagen-binding domain
proteins.


BL00023 24.31 8.920e-
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


8 


BL00023 


Type II fibronectin 
collagen-binding domain
proteins.


BL00023 24.31 8.920e-
33 413-450 BL00023
24.31 4.545e-27 353- 
390 


9 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 5.119e-
09 863-917 


10 


PR00464 


E-CLASS P450 GROUP II
SIGNATURE


PR00464D 17.40 6.182e-
12 294-312 PR00464G
12.41 4.231e-11 377-
393 


11 


PR00734 


GLYCOSYL HYDROLASE
FAMILY 7 SIGNATURE 


PR00734I 11.46 4.296e- 
09 502-520 


12 


PF00023 


Ank repeat proteins. 


PF00023B 14.20 C.SOOe- 
10 89-99 PF00023B 
14.20 2.636e-09 56-66 


14 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 3.848e-
09 79-113 


15 


PR00208 


GLIADIN AND LMW GLUTENIN 
SUPERFAMILY SIGNATURE 


PR00208A 12.59 9.868e- 
10 517-535 PR00208A 
12.59 2.233e-09 520- 
538 


17 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.200e- 
14 282-295 PD00066 
13.92 9.400e-14 477- 
490 PD00066 13.92 
6.500e-13 505-518 
PD00066 13.92 9.500e- 
13 254-267 PD00066 
13.92 1.429e-12 393- 
406 PD00066 13.92 
6.571e-12 421-434


18 


BL00845 


CAP-Gly domain proteins. 


BL00845 16.43 2.200e- 
25 55-oU 


20 


BL00487


IMP dehydrogenase / GMP
reductase proteins.


BL00487E 16.12 5.737e-
26 154-199 BL00487F
18.79 8.984e-22 235-
276 BL00487G 26.82
4.082e-12 287-329 


21 


BL00487 


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e- 
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82
4.082e-12 348-390 


22 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e- 
26 302-333


23


BL00107


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 3.250e-
26 302-333


25


BL00115


Eukaryotic RNA
polymerase II
heptapeptide repeat
proteins.


BL00115T 8.45 7.273e-
29 1208-1242 BL00115Q
18.08 2.776e-21 953-
983 BL00115Y 11.86
8.000e-17 1604-1650
BL00115M 19.19 8.130e-
16 731-774 BL00115H
14.34 9.392e-16 463-
496 BL00115A 15.44
7.414e-15 43-82
BL00115R 6.50 6.128e-
14 983-1010 BL00115J
16.71 S.289e-14 591-
617 BL00115I 8.33
4.336e-13 535-590
BL00115L 12.25 5.939e-
13 662-694 BL00115G
11.65 6.011e-13 435-
463 BL00115K 15.03
3.417e-10 617-659
BL00115O 16.76 5.805e-
10 863-913 BL00115P
11.54 7.538e-10 913-
953 BL00115S 18.24
7.968e-10 1010-1052
BL00115U 10.34 4.475e-
09 1242-1265


26


BL00420


Speract receptor repeat
proteins domain
proteins.


BL00420A 20.42 4.109e-
11 81-110 BL00420A
20.42 8.820e-10 84-113


27


BL00050


Ribosomal protein L23
proteins.


BL00050A 23.71 9.250e-
27 94-127 BL00050B
14.81 8.125e-12 133-
147



28 



PR00925


NONHISTONE CHROMOSOMAL
PROTEIN HMG17 FAMILY
SIGNATURE


PR00925B 3.73 3.089e-
10 41-54



29 



PF00756 



Putative esterase 



PF00756C 14.12 1.108e- 
09 486-516 



32


BL00557


FMN-dependent alpha-
hydroxy acid
dehydrogenases proteins


BL00557D 17.76 5.065e-
37 274-316 BL00557A
35.08 8.909e-29 24-73
BL00557C 15.59 1.000e-
28 227-257 BL00557B
21.27 8.898e-22 130-
169


34


PR00629


SHC PHOSPHOTYROSINE
INTERACTION DOMAIN
SIGNATURE


PR00629E 9.90 5.886e-
35 299-328 PR00629F
10.95 8.364e-32 334-
361 PR00629B 13.66
3.786e-27 224-247
PR00629A 13.45 8.364e-
21 206-222 PR00629C
3.80 4.000e-12 249-261
PR00629D 12.45 3.739e-
11 276-286


35


PD01270


RECEPTOR FC
IMMUNOGLOBULIN AFFIN.


PD01270A 17.22 1.000e-
40 39-79 PD01270B
22.18 2.875e-38 94-131
PD01270D 24.66 3.700e-
34 171-207 PD01270C
19.54 3.455e-30 137-
166


36


PD01270


RECEPTOR FC
IMMUNOGLOBULIN AFFIN.


PD01270A 17.22 1.000e-
40 39-79 PD01270B
22.18 2.875e-38 94-131
PD01270D 24.66 3.700e-
34 171-207 PD01270C
19.54 3.455e-30 137-
166


37 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412C 10.28 9.241e-
10 264-298 


38 


BL00412


Neuromodulin (GAP-43)
proteins.


BL00412C 10.28 9.241e-
10 264-298 


39 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412C 10.28 9.241e- 
10 264-298 


40 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380B 12.64 7.366e- 
14 342-360 PR00380C 
13.18 6.927e-13 375- 
394 PR00380D 9.93
2.180e-12 429-451
PR00380A 14.18 5.154e-
12 143-165 


44 


BL00345 


Ets-domain proteins. 


BL00345B 21.28 1.000e-
40 239-290 BL00345A
13.96 2.452e-14 204-
223 


45 


BL00345 


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 215-266 BL00345A 
13.96 2.452e-14 180- 
199 


46 


DM01551 


Jew OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551A 15.63 3.538e- 
26 172-202 DM01551C 
14.62 3.S71e-17 232-
252 DM01551B 8.84
4.750e-11 214-226


47 


PR00876 


NEMATODE METALLOTHIONEIN
SIGNATURE 


PR00876B 7.66 9.328e- 
11 246-260 


48 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.231e-
33 6-45 


50 


BL00972 


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins.


BL00972D 22.55 7.750e-
19 994-1019 BL00972A
11.93 7.120e-18 216-
234 BL00972E 20.72 
9.471e-14 1020-1042 
BL00972C 16.48 7.000e- 
13 360-375 BL00972B 
9.45 8.269e-10 302-312 


51 


BL00972 


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins.


BL00972D 22.55 7.750e-
19 990-1015 BL00972A
11.93 7.120e-18 216-
234 BL00972E 20.72
9.471e-14 1016-1038
BL00972C 16.48 7.000e-
13 360-375 BL00972B
9.45 8.269e-10 302-312


52 


BL01115 


GTP-binding nuclear 
protein ran proteins.


BL01115A 10.22 3.063e-
14 10-54


53


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 8.500e- 
17 20-38 PR00988F 
12.23 7.828e-15 196-
210 PR00988C 13.64 

PR00988E 8.27 3.872e- 
11 174-186 PR00988D 
5.95 6.878e-10 160-171 
PR00988B 11.60 2 . 915e- 
09 57-69 


55 


PR00762


CHLORIDE CHANNEL 
SIGNATURE 


PR00762C 9.29 4.682e- 
21 294-314 PR00762D 
11.29 4.103e-19 509-
530 PR00762A 14.22 
9.333e-18 199-217
PR00762F 15.12 3.100e-
16 563-583 PR00762B
12.12 6.063e-16 230-
250 PR00762E 12.07
2.286e-15 545-562
PR00762G 14.13 6.276e- 
13 601-616 


56 


BL00216 


Sugar transport 
proteins . 


BL00216B 27.64 8.800e- 
10 153-203 


58


PF00791


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 2.049e- 
10 1080-1135 


59


PF00791


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 2.049e- 
10 1062-1117 


61 


PD01929 


KINASE TYPE RESISTANCE 
ANTIBIOTIC TRANSFERASE 
AM. 


PD01929E 10.76 9.018e- 
09 206-221 




PR00360


C2 DOMAIN SIGNATURE


PR00360A 14.59 7.395e-
09 680-693 


69 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e-
09 670-683 


70 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 8.714e-
10 51-64 


72 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 5.304e- 
09 108-118 


73 


BL00239 


Receptor tyrosine kinase 
class II proteins. 


BL00239B 25.15 7.075e- 
12 118-166 


74 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 6.116e- 
10 93-120 


76


DM00471


0 PROKARYOTIC DNA
TOPOISOMERASE I. 


DM00471A 11.73 9.357e- 
13 53-66 DM00471B 
8.45 4.857e-12 70-81 


80


PD02876 


DECARBOXYLASE 
PHOSPHATIDYLSERINE . 


PD02876C 8.80 2.723e- 
13 223-236 PD02876D 
12.13 2.588e-12 334- 
351 


81


PD02876 


DECARBOXYLASE 
PHOSPHATIDYLSERINE . 


PD02876C 8.80 2.723e- 
13 282-295 PD02876D 
12.13 2.588e-12 393- 
410 


83 


BL00708


Prolyl endopeptidase 
family serine proteins. 


BL00708B 24.91 7.197e- 
12 570-601 


84


PR00014 


FIBRONECTIN TYPE III 
REPEAT SIGNATURE 


PR00014C 15.44 8.043e- 
09 985-1004 


86 


PR00678 


PI3 KINASE P85 
REGULATORY SUBUNIT 
SIGNATURE 


PR00678H 9.13 1.379e- 
09 246-269 


89 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 8.200e- 
09 264-279 PR00320B 
12.19 8.650e-09 264- 
279 


93 


BL00455 


Putative AMP- binding 
domain proteins. 


BL00455 13.31 2.588e- 
14 316-332 


95 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e-
10 123-154 


96 


BL00107 


Protein kinases ATP- 
binding region proteins.


BL00107A 18.39 4.000e- 
10 212-243 


97 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.318e- 
13 134-146 PR00081A 
10.53 2.500e-12 54-72 


98 


PR00380


KINESIN HEAVY CHAIN
SIGNATURE


PR00380A 14.18 5.500e-
24 401-423 PR00380D
9.93 7.188e-20 613-635
PR00380B 12.64 7.517e-
16 529-547 PR00380C
13.18 2.756e-13 560-
579


102 


PR00300 


ATP-DEPENDENT CLP
PROTEASE ATP-BINDING
SUBUNIT SIGNATURE 


PR00300A 9.56 7.S45e- 
14 289-308 


104 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BLG0479B 12 57 6 7flfip- 
18 298-314 BL00479A 
19.86 4.913e-16 155- 
178 BL00479A 19.86
4.300e-13 272-295
BL00479B 12.57 6.294e- 
12 181-197 


106 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 8.013e- 
12 43-83 


107 


DM01970 


0 Jew ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 S.OOOe- 
16 403-416 


108 


BL00191 


Cytochrome b5 family, 
heme-binding domain
proteins.


BL00191K 17.38 4.951e-
27 238-282 BL00191J
11.37 6.447e-17 182-
204 


109 


PD01066 


PROTEIN ZINC FINGER 
BINDING NU. 


PD01066 19.43 4.938e- 


1 1 n 

J. A W 




proteins . 


10 38-50 


113 


BL00107 


Protein kinases ATP- 
binding region proteins.


BL00107A 18.39 5.800e- 
xiJ lio-lo/ obuuxu /J3 
13.31 9.100e-14 225- 


117 


BL00214 


Cytosolic fatty-acid 
binding proteins.


BL00214B 26.51 1.000e-
21.17 7.052e-ll 5-31 


118


BL00107


Protein kinases ATP-
binding region proteins.


13 36-67 


119 


PR00529 


GONADOTROPHIN RELEASING
HORMONE RECEPTOR 
SIGNATURE 


PR00529C 11.03 7.506e- 
10 158-177 


120 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


121 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


09 80-95 


127 


BL00215


Mitochondrial energy
transfer proteins.


13 216-241 


128 


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032C 6.14 3.195e- 
12 147-157 BL01032H 
11.25 5.680e-11 318-

8.932e-11 282-296
BL01032I 10.42 8.902e-
09 379-389 


129


BL01310


ATP1G1 / PLM / MAT8
family proteins.


BL01310 14.74 6.694e- 
26 28-64 


130 


PR00990 


RIBOKINASE SIGNATURE 


PR00990B 12.32 9.534e- 
15 47-67 PR00990A 
16.23 5.500e-14 20-42 
PR00990C 12.62 2.412e- 
09 119-133 


133 


BL00880 


Acyl-CoA-binding
protein. 


BL00880 17.52 5.57Se- 
26 72-122 


134 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 9.308e- 
14 18-37 


135 


PR00215 


NEUROMODULIN SIGNATURE 


PR00215C 13.98 6.779e- 
10 475-496 


136


BL01310


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 2.432e- 
29 71-107 


140 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.882e- 
14 214-231 BL00028
16.07 9.471e-14 102-
119 BL00028 16.07
2.800e-13 18-35
BL00028 16.07 5.500e-
13 74-91 BL00028
16.07 9.100e-13 186-
203 BL00028 16.07
8.043e-12 46-63
BL00028 16.07 8.435e-
12 130-147 BL00028
16.07 9.217e-12 270-
287 BL00028 16.07
6.192e-11 242-259
BL00028 16.07 4.000e- 
10 158-175 


141 


BL00501 


Signal peptidases I
serine proteins.


BL00501D 16.69 9.538e-
14 113-133 BL00501C
9.61 8.688e-10 89-101 


143 


BL01020 


SAR1 family proteins.


BL01020C 15.35 7.722e- 
20 79-130 


146 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 6.400e- 
25 335-374 


149 


BL00126 


3'5'-cyclic nucleotide
phosphodiesterases
proteins.


BL00126C 22.07 1.450e-
25 509-550 BL00126E
35.22 3.951e-16 654-
709 BL00126D 25.50
1.360e-15 565-604
BL00126B 15.20 8.200e-
11 483-495 BL00126A
27.56 8.269e-11 442-
479 


151 


BL00632 


Ribosomal protein S4 
proteins.


BL00632 23.79 5.271e- 
20 106-149 


154 


BL00559 


Eukaryotic molybdopterin 

oxidoreductases 

proteins. 


BL00559I 13.63 5.304e- 
19 29-58 BL00559K 
13.17 2.957e-18 172- 
199 BL00559J 19.63
8.385e-13 99-151
BL00559L 13.60 5.814e-
12 241-259 


155 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.692e-
13 13-35 


157 


BL00406


Actins proteins. 


BL00406D 12.58 2.547e- 
18 275-330 BL00406A 
9.95 5.776e-16 15-50 
BL00406B 5.47 7.429e- 
12 69-124 BL00406C 
6.75 9.682e-12 128-183 


160 


BL00132 


Zinc carboxypeptidases, 
zinc-binding region 1
proteins.


BL00132A 26.07 7.000e-
14 22-63 BL00132C
21.35 3.466e-12 104-
145 


165 


PR00109


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 9.043e- 
13 139-158 


168 


BL00362 


Ribosomal protein S15
proteins. 


BL00362 24.67 9.700e- 
15 129-172 


169 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.000e-
35 640-686 BL00039A 

1 a A A 1 Q^d*=»-1^ 515 — " . 

251 BL00039B 19.19 
4.553e-13 378-404 
BL00039C 15.63 8.773e- 
12 465-489 


175 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.721e- 
12 14-36 


178 


BL01310 


ATP1G1 / PLM / MAT8
family proteins. 


BL01310 14.74 2.432e- 
29 133-169 


179 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.455e-
36 6-45


180 


PR00007


COMPLEMENT C1Q DOMAIN
SIGNATURE


t I\v wuU ' £> i. 4 , X D / , *lz JC 

20 160-180 PR00007A 

160 PR00007C 15.60 

I 225e-15 206-2?8 
PR00007D 9.64 6.885e- 

11 238-249


181 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 9.526e-
24 280-323 


182 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 9.526e- 
24 263-306 


183 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 9.526e- 
24 280-323 


184 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 9.526e- 
24 263-306 


188 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 460-471 


189 


PR00929 


AT-HOOK-LIKE DOMAIN
SIGNATURE 


PR00929C 5.26 3.328e- 
09 440-451 


190 


BL00383


Tyrosine specific 
protein phosphatases 
proteins. 


BL00383F 15.51 7.188e- 
17 666-682 BL00383A 
13.34 8.714e-17 162- 
177 BL00383E 10.35 

1.000e-14 333-344
BL00383E 10.35 7.300e-
14 628-639 BL00383F
15.51 1.720e-13 371- 
387 BL00383C 10.10 
3.000e-13 217-228 
BL00383D 11.92 7.000e- 
13 295-308 BL00383B 
7.61 1.692e-11 187-196
BL00383C 10.10 1.7S0e- 
09 509-520 BL00383D 

11.92 4.000e-09 589-
602 BL00383B 7.61 
8.000e-09 479-488 


191 


PR00450 


RECOVERIN FAMILY
SIGNATURE


PR00450C 12.22 7.911e-
15 83-105 PR00450C
12.22 6.286e-13 47-69 


193 


PF00564 


Octicosapeptide repeat


PF00564B 24.74 6.164e-


194 


PR00503


BROMODOMAIN SIGNATURE


PR00503D 20.81 9.156e- 
9.96 9.571e-13 170-187 


JL J ~> 


BL00901


Cysteine
synthase/cystathionine
beta-synthase P-
phosphate att.


18 67-117 


197 


BL00636 


Nt-dnaJ domain proteins.


BL00636A 8.07 6.211e-
17 40-57 BL00636B
15.11 2.000e-13 67-88 


198 


PR00690 


ADHESIN FAMILY SIGNATURE 


PR00690A 10.86 9.866e- 
09 463-482 


199 


BL01131 


Ribosomal RNA adenine
dimethylases proteins.


BL01131A 26.62 2.343e- 
12 84-130 


201 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE


PR00910A 2.51 8.352e-
12 509-522 


203 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.286e- 
10 39-72 


206 


PR00261 


LOW DENSITY LIPOPROTEIN 
(LDL) RECEPTOR SIGNATURE 


PR00261A 11.02 4.462e- 
19 65-87 PR00261C 
11.37 9.308e-19 65-87
PR00261D 12.47 2.667e- 
18 65-87 PR00261B 
14.12 4.000e-18 143- 
165 PR00261A 11.02
4.833e-18 143-165
PR00261D 12.47 7.500e-
18 143-165 PR00261B
14.12 5.065e-16 65-87
PR00261C 11.37 8.967e- 
16 143-165 PR00261F 
11.57 4.938e-13 143- 
165 PR00261E 11.08 
7.188e-13 65-87 
PR00261F 11.57 7.188e- 
13 65-87 PR00261E 
11.08 1.643e-ll 143- 
165 


209 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors.


PF00791B 28.49 6.143e-
13 118-173 PF00791C 
20.98 7.680e-10 132- 
171 


211 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE


PR00007A 19.33 5.731e-
19 131-158 PR00007B
14.16 4.115e-18 158-
178 PR00007C 15.60
1.675e-15 201-223
PR00007D 9.64 7.231e- 
11 233-244 


212 


BL00183 


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 1.545e- 
30 43-91 


213 


BL00183 


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.545e-
30 43-91 


215 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 1.900e-
29 568-614 BL00039A
18.44 1.871e-23 21-60
BL00039C 15.63 1.720e-
11 354-388 BL00039B
19.19 4.064e-11 277-
303


217 


BL00100 


Chloramphenicol 
acetyltransferase
proteins . 


BL00100D 17.22 8.484e- 
09 68-106 


219 


PR00213 


MYELIN P0 PROTEIN 
SIGNATURE 


PR00213C 15.94 3.969e-
11 199-227 


222 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 1.947e-09 
144-155 


224 


PR00875 


MOLLUSC METALLOTHIONEIN 
SIGNATURE 


PR00875A 5.83 1.000e-
09 901-913 


225 


BL00636


Nt-dnaJ domain proteins.


BL00636B 15.11 8.200e-
19 18-39 


226 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 1.000e-
21 21-38 BL00636B 
15.11 8.200e-19 45-66 


229 


PR00301 


70 KD HEAT SHOCK PROTEIN 
SIGNATURE


PR00301F 13.98 7.563e-
13 329-346 PR00301G
13.78 4.300e-12 361- 
382 


230 


BL00460


Glutathione peroxidases 
selenocysteine proteins. 


BL00460A 28.67 8.773e- 
20 35-70 BL00460B
9.73 7.429e-16 78-96 
BL00460C 14.35 2.831e- 
12 111-134 BL00460D 
16.89 8.773e-11 140-
160 


231 


PR00647 


SENR ORPHAN RECEPTOR 
SIGNATURE


PR00647B 10.19 8.522e-
09 273-287 


233 


BL00292 


Cyclins proteins. 


BL00292B 20.31 7.429e- 
27 244-275 BL00292A 
22.87 7.750e-27 201- 
235 


234 


PR00449


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 6.308e- 
13 7-29 PR00449C
17.27 4.462e-11 47-70
PR00449D 10.79 7.120e-
11 109-123


235 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 7.300e-
10 251-265 PR00019B
11.36 5.320e-09 119-
133 PR00019B 11.36
1.000e-08 229-243


236


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 7.300e-
10 245-259 PR00019B
11.36 5.320e-09 113-
127 PR00019B 11.36
1.000e-08 223-237


237 


PD00289


PROTEIN SH3 DOMAIN
REPEAT PRESYNA.


PD00289 9.97 8.448e-09
o7-8l


240 


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011D 14.03 3.492e-
10 616-635


241 


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011D 14.03 3.492e-
10 616-635


244 


BL00903


Cytidine and
deoxycytidylate
deaminases zinc-binding
region s.


BL00903 12.93 8.941e-
12 54-64


245 


DM00179


w KINASE ALPHA ADHESION
T-CELL.


DM00179 13.97 8.043e-
09 124-134


248 


BL00246


Wnt-1 family proteins.


BL00246D 23.97 1.000e-
40 186-239 BL00246E
20.32 1.000e-40 305-
351 BL00246B 13.69
4.176e-36 105-140
BL00246A 15.75 2.286e-
24 70-90 BL00246C
15.56 4.857e-22 150-
175


250 


PR00927


ADENINE NUCLEOTIDE
TRANSLOCATOR 1 SIGNATURE


PR00927E 14.93 5.114e-
10 253-275


254 


BL00674


AAA-protein family
proteins.


BL00674B 4.46 1.000e-
09 223-245


255 


PD01796


PROTEIN TRANSMEMBRANE
COBALT ZINC CADMIU.


PD01796 15.01 6.045e-
09 61-88


255 


BL50002


Src homology 3 (SH3)
domain proteins profile.


BL50002B 15.18 2.800e-
10 421-435


259 


PR00094


ADENYLATE KINASE
SIGNATURE


PR00094C 12.94 2.200e-
18 87-104 PR00094D
12.52 2.731e-14 161-
177 PR00094A 10.31
5.500e-14 11-25
PR00094B 11.01 4.115e-
13 39-54 PR00094E
11.25 7.333e-13 178-
193


259 


BL00892


HIT family proteins.


BL00892A 18.17 5.500e-
13 60-91


262 


BL00388


Proteasome A-type
subunits proteins.


BL00388A 23.14 1.000e-
40 8-54 BL00388B
31.38 3.864e-33 66-108
BL00388D 20.71 1.000e-
21 153-184 BL00388C
18.79 8.147e-16 126-
148


264 


BL00903


Cytidine and
deoxycytidylate
deaminases zinc-binding
region s.


BL00903 12.93 5.821e-
09 91-101


267 


BL00107


Protein kinases ATP-
binding region proteins.


BL00107B 13.31 1.529e-
09 241-257


270


BL00226


Intermediate filaments
proteins.


BL00226D 19.10 1.000e-
37 362-409 BL00226B
23.86 8.043e-35 196- 
244 BL00226C 13.23 
7.000e-20 261-292 
BL00226A 12.77 6.143e- 
15 96-111 


271


PD02952 


KINASE TRANSFERASE 
CHOLINE PROTEIN 
MULTIGENE FAMI. 


PD02952C 15.76 9.731e- 
16 235-265 PD02952B
15.57 5.625e-09 215- 
229 


272 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 1.000e-
40 106-160 PD02929B 
18.36 8.800e-17 179- 
199 


274


BL01027 


Glycosyl hydrolases 
family 39 proteins. 


BL01027B 15.34 3.486e- 
09 213-250 


275 


PR00424 


ADENOSINE RECEPTOR 
SIGNATURE 


PR00424D 14.32 6.451e- 
11 39-59 


277


BL00052


Ribosomal protein S7 
proteins . 


BL00052A 27.85 6.000e- 
13 137-184 BL00052B 
15.17 5.143e-12 208- 
235 


279 


BL00790


Receptor tyrosine kinase 
class V proteins . 


BL00790N 13.25 5.659e- 
13 267-294 


280 


PR00319 


(TRANSDUCIN) SIGNATURE 


PR00319D 11.64 6.625e-
23 107-125 PR00319C
13.41 1.000e-21 89-105
PR00319A 15.27 8.364e-
21 51-68 PR00319B
11.47 8.200e-19 70-85


281 


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-
23 94-112 PR00319C
13.41 1.000e-21 76-92
PR00319A 15.27 8.364e-
21 38-55 PR00319B
11.47 8.200e-19 57-72


287 


PF00929 


Exonuclease . 


PF00929D 16.17 7.366e- 
09 149-163 


291 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e-
09 93-127 


292 


BL00326


Tropomyosins proteins. 


BL00326A 14.01 2.360e- 
09 93-127 


294 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.714e- 
12 203-216 


295 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 S.SOOe- 
15 322-339 BL00028 
16.07 9.471e-14 433-
450 BL00028 16.07
4.600e-13 648-665
BL00028 16.07 5.500e-
13 760-777 BL00028
16.07 9.550e-13 788-
805 BL00028 16.07
3.348e-12 704-721
BL00028 16.07 6.478e-
12 461-478 BL00028
16.07 8.435e-12 844-
861 BL00028 16.07
1.692e-11 593-610
BL00028 16.07 2.038e-
11 211-228 BL00028
16.07 5.1S4e-11 732-
749 BL00028 16.07
5.846e-11 377-394
BL00028 16.07 6.885e-
11 816-833 BL00028
16.07 7.231e-11 676-
693 BL00028 16.07
9.654e-11 564-581
BL00028 16.07 4.086e-

31 / OliUUUzO 

16.07 7.429e-09 489- 
506 


296 


BL00215


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 8.333e- 
16 111-136 BL00215A

2.723e-ll 10-35 
BL00215B 10.44 9.526e- 
11 152-165 BL00215B 
10.44 7.375e-10 59-72 
BL00215A 15.82 9.824e- 
10 205-230


302 


PF00953 


Glycosyl transferase. 

* 


PF00953C 19.70 8.773e- 
34 23©-2o9 Pr00953A 
19.68 5.000e-25 102- 
129 PF00953B 6.17 


304 


PF00152 


tRNA synthetases class
II


PF00152D 21.30 8.364e- 

2o 422-46X rr UUlb^L 

28.03 9.250e-21 220- 

2.6S8e-13 159-184 
PF00152A 19.68 5.714e- 
11 44-67 


305 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 8.250e-
35 37-76 


305 


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN.


PD02784B 26.46 5.840e- 
09 92-135 


307 


PR00454


ETS DOMAIN SIGNATURE 


PR00454C 11.24 7.808e- 
09 1167-1186 


308 


PR00237


RHODOPSIN-LIKE GPCR
SUPERFAMILY SIGNATURE


PR00237E 13.03 5.091e-
13 188-212 PR00237G
19.63 7.207e-13 268-
295 PR00237A 11.48
4.375e-11 24-49
PR00237C 15.69 3.057e-
10 101-124 PR00237D
8.94 4.7S0e-10 137-159
PR00237F 13.57 5.364e-
10 230-255 PR00237B
13.50 9.438e-10 57-79


309


BL00522


DNA polymerase family X
proteins.


BL00522C 11.au / . 5 / /e- 
24 315-339 BL00522F 
14.90 1.310e-15 470- 
4 94 BL00522A 25.52 
jl » z 0 z>e — jl. 4 X/ji— zzo 
BL00522E~*19.63 8.615e- 
14 430-460 BL0052.2B 
27 •an Q K7^f»-"12 267 — 

313 


310


BL00326


Tropomyosins proteins.


BL00326D 8.76 5.235e-
10 856-897






major histocompatibility 


14 151-174 BL00290B 
13.17 9.000e-12 211-
229 


313 


BL00345 


Ets-domain proteins.


BL00345B 21.28 1.000e-
40 34-85 BL00345A 
13.96 9.217e-16 1-20 


315 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 5.091e-
15 63-76 


317 


BL01020 


SAR1 family proteins.


BL01020C 15.35 3.198e- 
17 79-130 


318 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 4.696e- 
11 164-214 


320 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.814e-
10 216-235


321 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 S.688e- 
10 329-372 


322 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 8.765e- 
12 558-577 


324


BL01241 


Link domain proteins. 


BL01241 35.81 8.313e- 
30 183-236 BL01241 
35.81 3.222e-13 282-
335 


326 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 4.000e-
12 515-566 BL00412D
16.54 5.705e-11 516-
567 BL00412D 16.54 
7.848e-10 518-569 
BL00412D 16.54 1.827e- 
09 514-565 BL00412D 
16.54 1.918e-09 513- 
564 BL00412D 16.54 
2.102e-09 520-571 


328 


BL00232 


Cadherins extracellular 
repeat proteins domain 
proteins . 


BL00232B 32.79 9.557e- 
20 151-199 BL00232B 
32.79 2.246e-18 41-89 
BL00232B 32.79 5.985e- 
18 370-418 BL00232B 
32.79 5.500e-16 258- 
306 BL00232B 32.79 
9.384e-15 475-523 
BL00232C 10.65 2.537e-
12 256-274 BL00232C
10.65 4.325e-11 368-
386 BL00232C 10.65
7.261e-11 473-491
BL00232C 10.65 7.457e- 
11 39-57 


330 


PR00454


ETS DOMAIN SIGNATURE


PR00454C 11.24 7.808e- 
09 1167-1186 


331 


BL00598 


Chromo domain proteins.


BL00598 14.45 8.393e- 
18 27-49 


333 


BL01016 


Glycoprotease family 
proteins.


BL01016C 22.84 3.925e-
32 70-115 BL01016E 
14.88 5.286e-19 149- 
177 BL01016H 13.71 
7.577e-13 291-301 
BL01016D 8.86 3.298e- 
11 127-140 BL01016G 
7.14 5.622e-10 261-271 
BL01016A 5.65 7.167e-
10 4-19 BL01016F 
13.34 1.563e-09 200- 
212 BL01016B 8.93 
8.855e-09 38-50 


339 


BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 S.SOOe- 
11 17-61 


340 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 1.231e- 
33 10-49 


341 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.34 5.042e- 
OS 55-109 


342 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.400e-
30 16-55


343 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 1.000e-
40 20-68 


346 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.764e- 
11 135-154 


347 


PR00109


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.764e-
11 135-154


351 


BL01187


Calcium-binding EGF-like
domain proteins pattern
proteins.


BL01187B 12.04 1.783e-
13 100-116 BL01187B
12.04 8.435e-13 276-
292 BL01187B 12.04
8.800e-11 13-29
BL01187B 12.04 7.429e-
10 54-70 BL01187B
12.04 5.725e-09 231-
247 BL01187A 9.98
7.000e-09 255-267


352


PD00078


REPEAT PROTEIN ANK
NUCLEAR ANKYR.


PD00078B 13.14 5.950e-
10 366-379 PD00078B
13.14 4.522e-09 168-
181


354


BL00380


Rhodanese proteins.


BL00380F 9.76 6.694e-
11 542-553


355


PF00628


PHD-finger.


PF00628 15.84 1.000e-
11 116-131


356 


PR00587 


SOMATOSTATIN RECEPTOR 
TYPE 1 SIGNATURE 


PR00587A 8.06 9.700e- 
09 17-37


358


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 4.462e-
15 261-274 PD00066
13.92 6.500e-13 233-
246 PD00066 13.92
4.300e-09 289-302


361 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 9.604e-
13 54-109 PF00791B
28.49 1.095e-12 21-76
PF00791A 27.85 1.432e-
09 71-126 PF00791B
28.49 7.440e-09 184-
239 


362 


PF00791 


Domain present in ZO-1
and Unc5-like netrin 
receptors . 


PF00791B 28.49 2.273e- 
11 279-334 


363 


PR00450 


RECOVERIN FAMILY 
SIGNATURE 


PR00450C 12.22 5.080e- 
10 73-95 PR00450C 
12.22 3.278e-09 109- 
131 


364


PF00242


DNA polymerase (viral)
N-terminal domain
proteins.


PF00242Q 13.51 2.328e- 
09 22-68 


365


PF00242


DNA polymerase (viral)
N-terminal domain
proteins.


PF00242Q 13.51 2.328e- 
09 22-68 


366 


BL01160 


Kinesin light chain
repeat proteins.


BL01160B 19.54 6.644e- 
09 1038-1092 


367


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 1.360e- 
09 229-243 PR00019B 
11.36 6.040e-09 91-105 
PR00019A 11.19 8.667e- 
09 370-384 


368


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011D 14.03 9.000e-
15 30-49 PR00011A
14.06 9.830e-15 30-49 
PR00011B 13.08 4.500e- 
14 30-49 PR00011C 
24.25 5.143e-09 6-35 


369


BL01032


Protein phosphatase 2C
proteins.


BL01032H 11.25 4.1S0e- 
09 417-430 


372 


BL00478 


LIM domain proteins. 


BL00478B 14.79 7.750e- 
12 410-425 


373 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.757e- 
34 26-65 


376 


PR00170 


SODIUM CHANNEL SIGNATURE


PR00170E 6.48 2.739e-
10 88-118


380 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 1.000e-
23 276-307 BL00107B 
13.31 1.692e-12 342- 
358 


381 


BL00455


Putative AMP-binding
domain proteins.


BL00455 13.31 5.714e-
12 50-66 


382 


PR00624


HISTONE H5 SIGNATURE


PR00624G 4.08 4.900e-
09 524-544 


384 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.950e- 
10 366-379 PD00078B 
13.14 4.522e-09 168- 
181 


385 


PR00511


TEKTIN SIGNATURE 


PR00511D 7.11 5.371e- 
09 67-80 


386 


PD02870 


RECEPTOR INTERLEUKIN-1
PRECURSOR.


PD02870B 18.83 6.000e- 
10 97-130 


383 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 5.000e-
13 516-529 


3 83 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290A 20.89 7.657e- 
09 151-174 


350 


BL00215


Mitochondrial energy- 
transfer proteins. 


BL00215A 15.82 5.200e-
15 221-246 BL00215A
15.82 7.618e-14 20-45
BL00215A 15.82 8.851e-
11 123-148 BL00215B
10.44 9.526e-11 69-82
BL00215B 10.44 7.300e- 
09 272-285 BL00215B 
10.44 8.500e-09 165- 
178 


394 


BL00674 


AAA-protein family 
proteins. 


BL00674B 4.46 2.723e-
16 299-321 


397 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 8.579e-
11 141-155 


398 


PR00761


BINDIN PRECURSOR
SIGNATURE


PR00761B 9.93 6.764e-
09 55-74


399 


BL00240 


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 7.907e- 
10 118-142 


401


PF00676


Dehydrogenase E1
component.


PF00676B 24.71 8.071e-
18 331-369 PF00676D
14.40 3.854e-15 486-
506 PF00676C 16.88 
9.182e-14 454-478 


402 


BL00514 


Fibrinogen beta and 
gamma chains C- terminal 
domain proteins. 


BL00514C 17.41 4.673e-
28 4432-4469 BL00514G 
15.98 6.092e-14 4555- 
4585 BL00514D 15.35 
2.532e-12 4473-4486 
BL00514F 11.65 4.288e- 
10 4519-4534 BL00514H 
14.95 4.955e-10 4584- 
4609 


403 


PF00992 


Troponin.


PF00992A 16.67 5.974e-
09 105-140 


404 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019B 11.36 1.450e-
10 73-87 PR00019A
11.19 8.043e-10 76-90
PR00019B 11.36 1.000e-
09 50-64 PR00019B 
11.36 1.000e-09 96-110 


405 


BL00232 


Cadherins extracellular 
repeat proteins domain 
proteins.


BL00232B 32.79 9.557e- 
20 139-187 BL00232B 
32.79 2.246e-18 29-77 
BL00232B 32.79 5.985e- 
18 358-406 BL00232B 
32.79 5.500e-16 246-
294 BL00232B 32.79 
9.384e-15 463-511 
BL00232C 10.65 2.537e- 
12 244-262 BL00232C 
10.65 4.326e-11 356-
374 BL00232C 10.65
7.261e-11 461-479

11 27-45 


407 


PF00426 


Outer Capsid protein VP4
(Hemagglutinin).


09 902-940 


409 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.695e- 
09 126-180 


410 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 2.731e-
09 252-275 


411 


PF00646 


F-box domain proteins. 


PF00646A 14.37 6.344e- 
09 86-100 


412 


BL00603 


Thymidine kinase 
cellular-type proteins.


BL00603B 11.39 8.500e- 
09 542-557 


415 


BL00866 


Carbamoyl-phosphate
synthase subdomain
proteins.


BL00866B 36.29 3.571e- 
31 245-291 BL00866C 
23.26 9.000e-25 331- 
366 


418 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 6.114e- 
09 590-602 


421 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors.


PF00791B 28.49 7.955e-
14 23-78 PF00791B
28.49 3.653e-12 273-
328 PF00791B 28.49
4.273e-11 156-211
PF00791B 28.49 7.818e-
11 89-144 PF00791B
28.49 1.524e-10 56-111
PF00791C 20.98 3.559e-
09 37-76 PF00791C 
20.98 5.235e-09 170- 
209 PF00791C 20.98 
5.235e-09 381-420 
PF00791B 28.49 6.202e- 
09 189-244 PF00791B 
28.49 7.028e-09 435- 
490 PF00791B 28.49 
8.679e-09 367-422 


424 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 7.207e- 
28 1645-1679 


425 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 5.881e-
10 228-251 


429 


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 4.ouoe- 
11 31-40 


431 


BL00039 


DEAD-box subfamily ATP- 
dependent helicases 
proteins.


BL00039D 21.67 1.844e-
34 490-536 BL00039A
18.44 5.615e-19 205-
244 BL00039B 19.19
8.920e-16 251-277
BL00039C 15.63 5.781e-
15 333-357


432 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 7.652e- 
12 169-185 


433 


PR00828 


FORMIN SIGNATURE 


PR00828B 5.23 8.218e-
10 382-405 


436 


BL00415 


Synapsins proteins. 


BL00415N 4.29 8.643e- 
11 195-239 BL00415N 
4.29 3.036e-09 809-853 


443 


PR00834 


HTRA/DEGQ PROTEASE 
FAMILY SIGNATURE 


PR00834F 10.91 6.040e- 
11 221-234 


446 


PF01140


Matrix protein (MA),
p15.


PF01140D 15.54 9.663e-
10 183-218 PF01140D
15.54 3.093e-09 246-
281


449 


PR00568 


DOPAMINE D3 RECEPTOR 
SIGNATURE 


PR00568G 13.95 5.551e-
09 39-53 


451 


PF00084 


Sushi domain proteins
(SCR repeat proteins).


PF00084B 9.45 3.813e-
10 47-59 


452 


BL00790 


Receptor tyrosine kinase 
class V proteins.


BL00790I 20.01 2.821e- 
09 618-649 


456 


PR00380


KINESIN HEAVY CHAIN
SIGNATURE


PR00380A 14.18 1.000e-
25 77-99 PR00380D
9.93 1.000e-21 281-303
PR00380C 13.18 8.286e-
17 230-249 PR00380B
12.64 4.724e-16 194-
212 


457 


PR00253 


GAMMA-AMINOBUTYRIC ACID 
(GABA) RECEPTOR
SIGNATURE


PR00253A 9.15 9.143e-
24 246-267 PR00253B
13.47 2.000e-23 272- 
294 PR00253C 13.85 
7.000e-23 306-328 
PR00253D 16.68 5.950e- 
21 452-473 


467 


PR00849 


GLYCOSYL HYDROLASE 
FAMILY 58 SIGNATURE 


PR00849D 9.77 9.236e- 
09 910-937 


471 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins.


BL00678 9.67 8.200e-12 
33-44 


472 


BL00226 


Intermediate filaments 
proteins.


BL00226B 23.86 3.721e- 
09 282-330 


473 


BL00344 


GATA-type zinc finger
domain proteins. 


BL00344 17.99 7.000e- 
12 814-852 


474 


BL00481 


Thiol-activated
cytolysins proteins.


BL00481E 13.07 8.909e-
09 173-199 


479 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 2.571e-
09 393-408 


480 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 1.900e-
38 8-47 


481 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405C 19.41 1.000e-
19 451-473 PR00405B
11.83 4.333e-18 430-
448 PR00405A 17.71
4.971e-18 411-431


482 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE


PR00049D 0.00 9.286e-
10 959-974 PR00049D 
0.00 9.857e-10 958-973 
PR00049D 0.00 1.305e- 
09 937-952 PR00049D 
0.00 8.322e-09 939-954 


486 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007B 14.16 8.615e-
23 653-673 PR00007A
19.33 6.192e-22 626- 
653 PR00007C 15.60 
5.846e-19 698-720 
PR00007D 9.64 3.647e- 
13 732-743 


487 


PD00567


PROTEIN RNA-BINDING RNA
REPEAT HYD.


PD00567B 18.23 2.853e- 


488 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.569e- 
12 3-21 


489 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 4.882e- 
27 30-69 PD01066 
19.43 3.430e-10 71-110 


490 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.864e- 
09 663-678 


492 


BL01128 


Shikimate kinase
proteins.


BL01128A 18.84 6.464e- 
17 58-92 


497 


PF00429 


ENV polyprotein (coat
polyprotein).


PF00429 31.08 7.171e-
13 * X — / J. 


498 


BL00120 


Lipases, serine 
proteins. 


BL00120B 11.37 7.923e- 
09 185-200 


500 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 7.353e- 


501 


BL01159 


WW/rsp5/WWP domain 
proteins.


BL01159 13.85 8.579e- 


505 


BL00021 


Kringle domain proteins.


BL00021B 13.33 3.739e- 
17 492-510 


508 


PR00120 


H+TRANSPORTING ATPASE
(PROTON PUMP) SIGNATURE


PR00120C 9.90 5.800e- 
19 705-722 


509


DM01417


6 kw INDUCING XPMC2 
MUSHROOM SPAC22G7.04. 


DM01417E 20.62 2.938e- 
16 362-395 DM01417D 
11.08 3.800e-13 322- 
336 


510 


PF00534


Glycosyl transferases
group 1.


PF00534B 14.47 6.625e- 
09 346-370 


511 


PF00534 


Glycosyl transferases 
group 1.


PF00534B 14.47 6.625e-
09 293-317 


512 


PF00534 


Glycosyl transferases 
group 1.


PF00534B 14.47 6.625e-
09 366-390 


513 


PD01841 


PHOSPHORYLASE KINASE
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 110-160 PD01841B
14.35 1.000e-40 181-
222 PD01841D 17.87
1.000e-40 243-295
PD01841F 13.36 1.000e-
40 333-382 PD01841G
24.26 1.000e-40 386-
440 PD01841L 18.42
1.000e-40 968-1010
PD01841I 23.00 4.545e-
37 762-804 PD01841E
18.60 3.750e-36 295-
333 PD01841J 14.94
6.023e-35 851-888
PD01841H 21.30 2.909e-
33 490-527 PD01841K
14.81 7.088e-33 924-
954 PD01841C 13.78
9.386e-23 222-243
PD01841M 10.82 8.594e-
21 1054-1073 PD01841I 

9"i t\l\ 9 ££*Ta_1 \ CAQj 

^^•uu £ too a*:*— | 
591 


514 


PR00153 


CYCLOPHILIN PEPTIDYL- 
ISOMERASE SIGNATURE 


PR00153C 11.01 7.188e- 

-LJ ?D"111 r KUul J J 

9.10 4.150e-12 122-138 


515 


BL00740 


MAM domain proteins. 


BL00740A 13.87 7.188e- 
12 410-423 


516 






12 1018-1052 


517 


BL00242 


Integrins alpha chain 
proteins. 


BL00242C 16.36 8.320e-
09 12-42 


~> ^- J 






UMOUUilA 10 . OU j. /sue— 
39 20-68 DM00031B 
15.41 1.000e-25 84-118 


525 


BL00319


glycoprotein 
extracellular domain 
proteins . 


RT.rtmi Qf 1 1 "19 ft ~a T cr _ 

DiiUUJl X/ . X*i O.J / JC 

10 61-95 


526 


PF00789 


Domain present in 
ubiquitin-regulatory
proteins.


PF00789B 19.70 3.308e-
12 322-343 PF00789C
20.98 5.269e-09 367-
392 


528 


BL01162


Quinone oxidoreductase /
zeta-crystallin
proteins.


BL01162C 22.80 1.500e- 
16 120-164


529 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 3.893e- 
09 60-73 


532 


BL00215


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 4.000e- 
17 11-36 BL00215A 
15.82 8.660e-11 123-
148


533 


BL00215 


Mitochondrial energy 
transfer proteins.


BL00215A 15.82 4.000e-
17 11-36 BL00215A
15.82 8.660e-11 97-122


534


BL00098


Thiolases acyl-enzyme
intermediate proteins. 


BL00098C 21.65 2.800e- 
38 181-227 BL00098B 
32.59 5.345e-38 86-141 
BL00098D 26.30 8.364e- 
35 245-286 BL00098E 
22.12 1.000e-34 314- 
352 BL00098F 10.18 
4.971e-22 365-386 
BL00098A 10.60 6.455e-
11 38-50 


535 


PR00370


FLAVIN-CONTAINING
MONOOXYGENASE (FMO)
SIGNATURE 


PR00370E 11.96 7.429e- 
22 321-340 PR00370D 
16.33 6.143e-21 185-
204 PR00370F 17.75 
6.559e-21 376-396 
PR00370B 10.91 9.591e- 
21 27-46 PR00370C 
12.72 3.500e-20 140- 
157 PR00370A 3-3S 
6.442e-lv 4-2D 


536 


BL00028 


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.429e-
16 285-302 BL00028
16.07 6.294e-14 341-
358 BL00028 16.07
1.346e-11 369-386
BL00028 16.07 1.692e-
11 397-414 BL00028
16.07 4.452e-11 453-
470 BL00028 16.07
7.231e-11 425-442
BL00028 16.07 4.300e-
10 313-330


537


BL00762


WHEP-TRS domain
proteins.


BL00762A 23.43 9.419e- 
15 844-881 


538 


BL00762 


WHEP-TRS domain 
proteins.


BL00762A 23.43 9.419e- 
15 819-856 


539 


BL00762 


WHEP-TRS domain 
proteins.


BL00762A 23.43 9.419e- 
15 822-859 


540 


PR00985 


LEUCYL-TRNA SYNTHETASE 
SIGNATURE 


PR00985A 12.10 9.000e- 
10 357-375 


541 


PD02102 


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 
HYDROL. 


PD02102A 16.74 1.000e-
40 3-47 PD02102B 
18.28 4.375e-34 57-100 
PD02102D 21.69 1.923e- 
30 179-218 PD02102C 
26.34 8.929e-26 100- 
146 


543 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 1.000e-
10 48-65 BL00028
16.07 6.400e-10 193-
210 BL00028 16.07
1.000e-09 343-360
BL00028 16.07 6.914e- 
09 78-95 


545 


BL00250


TGF-beta family
proteins.


BL00250A 21.24 8.000e-
31 293-329 BL00250B
27.37 5.286e-24 354- 
390 


547 


PR00319


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 2.714e-
09 186-201 PR00319A
15.27 7.344e-09 210-
227






sir Ajdppd - jd / nci/ aorsa jl 
domain proteins.


BL01204A 17.74 1.000e-
40 8-56 BL01204D 
16.42 1.000e-40 177- 
221 BL01204E 13.83 

BL01204C 13.93 8.714e- 
22 141-160 BL01204B 
15.41 4.333e-16 102- 


549


PR00326


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE


PR00326A 8.75 8.364e- 
lb ^bb - z /fa 


551 


PF00632 


HECT-domain (ubiquitin-
transferase).


PF00632C 20.66 3.302e- 
^3 1569-1601 PrOOB32S 
18.45 3.700e-21 1515- 
1543 




BL00290


Immunoglobulins and
major histocompatibility
complex proteins.


BL00290B 13.17 1.000e-
14 187-205 BL00290A
20.89 2.059e-14 130-
153


557 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 6.339e- 
09 846-879 


559 


DM01111 


4 kw PHOSPHATASE 
TRANSFORMING 6 IK PDF! . 


DM01111L 11.93 3.762e- 
09 7-35 


562 


PF00658 


Poly-adenylate binding 
protein, unique domain 
proteins.


PF00658C 16.33 9.455e-
32 118-155 


564 


BL00141 


Eukaryotic and viral 
aspartyl proteases 
proteins.


BL00141A 12.10 4.150e-
10 472-488 


566


PF00855


PWWP domain proteins.


PF00855 13.75 5.667e- 
15 272-289 


567 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 4.977e- 
13 22S-26B 


569 


BL00107 


Protein kinases ATP-
binding region proteins. 


BL00107A 18.39 7.000e-
19 118-149 BL00107B 
13.31 5.500e-15 183- 
199 


570 


BL00107 


Protein kinases ATP-
binding region proteins.


BL00107A 18.39 7.000e- 
19 118-149 BL00107B 
13.31 5.500e-15 183-
199 


572 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e-
34 454-483 PR00193C 
12.60 2.636e-31 223- 
251 PR00193B 11.69 
7.750e-29 171-197 
PR00193A 15.41 2.588e- 
22 115-135 PR00193E 
19.47 6.559e-19 508- 
537 


Z> /J 


PR00193


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e-
34 470-499 PR00193C
12.60 2.636e-31 239-
267 PR00193B 11.69
7.750e-29 171-197
PR00193A 15.41 2.588e-
22 115-135 PR00193E
19.47 6.559e-19 524-
553


575 


BL00752 


XPA protein. 


BL00752B 19.17 9.703e- 
10 885-929 


576 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 7.000e-
09 276-295 


577 


BL00116 


DNA polymerase family B
proteins.


BL00116A 12.81 5.737e-
13 864-877 BL00116B
11.82 1.529e-12 952-
965


578 


BL00195 


Glutaredoxin proteins. 


BL00195B 15.31 7.158e- 
09 121-141 


579 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 9.000e- 
11 217-231 PR00019B 
11.36 1.360e-09 386- 
400 PR00019A 11.19 
3.333e-09 389-403 
PR00019B 11.36 8.920e- 
09 363-377 


580 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 2.125e- 
25 275-296 PR00253B 
13.47 7.923e-24 301- 
323 PR00253D 16.68 
5.846e-23 444-465 
PR00253C 13.85 2.241e- 
20 335-357 


583 


PR00343 


SELECTIN SUPERFAMILY 
COMPLEMENT-BINDING
REPEAT SIGNATURE


PR00343C 16.85 2.286e-
11 1233-1252 PR00343C
16.85 5.500e-11 333-
352 PR00343C 16.85
5.500e-11 783-802
PR00343C 16.85 4.246e- 
10 1491-1510 PR00343C 
16.85 8.230e-10 1686- 
1705 


584 


DM01537 


kw SKI2W SKI2 NUCLEOLAR 
HELICASE.


DM01537B 21.63 1.878e-
37 79-126 DM01537B
21.63 9.491e-30 916-
963 DM01537A 15.14
3.136e-11 784-804


586 


PF00013


KH domain proteins 
family of RNA binding 
proteins.


PF00013 5.78 1.450e-09 
124-136 


587 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 4.409e- 
13 262-296 


589 


BL00478


LIM domain proteins. 


BL00478B 14.79 1.643e- 
13 261-276 BL00478B 
14.79 7.709e-09 321- 
336 


590 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.000e-
15 931-948 


591 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.000e-
15 1062-1079 


593 


PF00628 


PHD-finger.


PF00628 15.84 3.455e- 
12 424-439 


594 


PR00205


CADHERIN SIGNATURE


PR00205B 11.39 2.241e-
16 558-576 PR00205A
14.73 9.300e-13 542-
558 PR00205C 13.65
5.304e-12 594-609
PR00205B 11.39 4.273e-
10 336-354 


596 


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.789e- 
18 307-338 


598 


PD01675 


GLYCOPROTEIN MAJOR 
ENVELOPE PROBABLE U3.


PD01675C 19.89 2.330e- 
10 55-39 


600 


BL00242 


Integrins alpha chain
proteins.


BL00242E 9.03 9.591e-
27 985-1014 BL00242C
16.86 4.115e-26 286-
316 BL00242D 13.57
4.150e-25 357-382
BL00242B 8.13 7.353e-
12 189-199 BL00242D
13.57 3.455e-11 421-
446 BL00242A 13.80
5.000e-11 61-73
10 291-316 


601 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320A 16.74 5.610e-
09 198-213 


602 


PR00278 


PANCREATIC HORMONE 
SIGNATURE 


PR00278A 12.43 4.569e-
10 331-348 


603 


BL00479 


Phorbol esters /
diacylglycerol binding
domain proteins.


BL00479C 12.01 3.250e-
12 170-183 


604 


BL00315 


Dehydrins proteins.


BL00315A 9.35 1.672e- 
09 424-452 


605 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.794e- 
10 295-339 


606 


PR00926


MITOCHONDRIAL CARRIER
PROTEIN SIGNATURE


PR00926F 17.75 1.000e-
13 335-358 


608 


PF00855


PWWP domain proteins.


PF00855 13.75 5.167e- 
15 265-282 


609 


PF00855


PWWP domain proteins.


PF00855 13.75 5.167e-
15 211-228


612 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN.


DM01206B 10.69 7.411e-
10 877-897 DM01206B
10.69 8.027e-10 861-
881 DM01206B 10.69
9.137e-10 873-893
DM01206B 10.69 1.456e-
09 859-879 DM01206B
10.69 1.797e-09 879-
899 DM01206B 10.69
4.076e-09 865-885
DM01206B 10.69 7.038e-
09 898-918 DM01206B
10.69 7.949e-09 871-
891 DM01206B 10.69
8.291e-09 767-787


615 


PD02699 


PROTEIN DNA-BINDING
BINDING DNA. 


PD02699A 8.91 2.023e- 
28 129-158 PD02699C 
24.84 1.000e-27 317- 
364 PD02699B 18.28 
1.000e-17 158-182 


616 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e- 
22 288-310 PR00380D 
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e- 
16 410-428 PR00380C 
13.18 2.976e-13 436- 
455


617 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e-
22 288-310 PR00380D
9.93 3.721e-17 486-508
PR00380B 12.64 2.241e-
16 410-428 PR00380C
13.18 2.976e-13 436-
455


618 


DM01206 


CORONAVIRUS NUCLEOCAPSID 


DM01206B 10.69 5.143e-

X a pji"33i jyn w i v o 

10.69 2.603e-10 535-
555 


621 


PR00700


PROTEIN TYROSINE
PHOSPHATASE SIGNATURE.


PR00700B 16.80 3.160e-
21 561-582


622 


BL00239


Receptor tyrosine kinase
class II proteins.


BL00239F 28.15 3.222e-
10 647-692 BL00239C 
18.75 8.304e-10 543- 
566 


623 


PR00407


EUKARYOTIC MOLYBDOPTERIN 
DOMAIN SIGNATURE 


PR00407K 9.94 8.448e- 
09 326-339 


624 


BL00641 


Respiratory-chain NADH
dehydrogenase 75 Kd
subunit proteins.


BL00641C 21.10 1.000e-
40 157-202 BL00641E
24.37 1.000e-40 255-
308 BL00641F 33.12
1.000e-40 571-623
BL00641A 17.15 1.818e-
37 48-80 BL00641B
12.62 5.846e-34 113-
139 BL00641D 13.23
9.308e-29 216-240


627


PR00103


CAMP-DEPENDENT PROTEIN
KINASE SIGNATURE 


PR00103E 17.80 2.500e- 
18 367-380 PR00103B 
13.39 2.080e-14 297- 
312 PR00103A 9.59 
2.957e-14 282-297 
PR00103D 10.83 3.077e- 
12 346-358 PR00103C 
15.68 1.000e-11 334-
344 PR00103B 13.39
1.450e-11 175-190
PR00103A 9.59 1.720e-
10 160-175


630 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081A 10.53 6.211e- 
16 4-22 


631


PF00651


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 8.500e- 
14 37-50 


632 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 2.233e- 
10 1324-1344 DM01206B 
10.69 4.822e-10 1276- 
1296 DM01206B 10.69 
7.658e-10 1328-1348 
DM01206B 10.69 8.274e- 
10 1280-1300 DM01206B 
10.69 4.532e-09 1320- 
1340 DM01206B 10.69 
7.266e-09 1326-1346 


635 


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 7.600e- 
23 145-176 BL00107B 
13.31 2.636e-13 211- 
227 


636 


BL00657


Fork head domain 
proteins. 


BL00657A 19.39 1.545e- 
30 101-143 BL00657B 
22.27 7.750e-26 149- 
192 


637 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.000e-
10 607-623 


643 


BL00018 


EF-hand calcium-binding
domain proteins. 


BL00018 7.41 4.913e-09 
199-212 


647 


PF00628 


PHD-finger.


PF00628 15.84 2.350e- 
13 385-400 PF00628 
15.84 3.455e-12 464- 
479 


648 


BL01129 


Hypothetical 
yabO/yceC/sfhB family 
proteins.


BL01129E 13.25 4.000e-
25 332-357 BL01129C
25.56 8.200e-23 236-
279 BL01129B 12.51
6.118e-13 191-212 


649 


BL01228


Hypothetical cof family
proteins.


BL01228D 17.44 3.908e- 
10 455-480 


650 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 6.684e-
13 771-814 


651 


BL50002


Src homology 3 (SH3)
domain proteins profile.


BL50002A 14.19 1.750e-
12 1026-1045 


653 


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 4.000e- 
24 253-274 PR00253C 
13.85 8.800e-24 313- 
335 PR00253B 13.47
3.143e-22 279-301 
PR00253D 16.68 7.652e-
20 422-443 


654 


PD01719 


PRECURSOR GLYCOPROTEIN 
SIGNAL RE. 


PD01719A 12.89 4.452e- 
11 969-997 PD01719A 
12.89 3.961e-10 128- 
156 PD01719A 12.89 
7.395e-10 1276-1304 
PD01719A 12.89 1.222e- 
09 1220-1248 


657 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook).


BL00354C 6.61 8.397e-
09 563-578


658 


BL00354


HMG-I and HMG-Y DNA-
binding domain proteins
(Ahook).


BL00354C 6.61 8.397e- 
09 580-595 


659 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.174e-
13 539-572 DM00215
19.43 4.750e-12 549-
582 DM00215 19.43
9.824e-11 551-584
DM00215 19.43 2.929e-
10 548-581 DM00215
19.43 4.054e-10 550-
583 DM00215 19.43
5.339e-10 552-585
DM00215 19.43 7.107e-
10 544-577


660 


PR00688 


XYLOSE ISOMERASE 
SIGNATURE 


PR00688I 13.78 9.518e-
09 224-236 


661 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 5.950e-
23 249-292 


662 


PR00360


C2 DOMAIN SIGNATURE


PR00360B 13.61 7.158e-
10 596-610 


663 


PR00360


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


664 


PR00360


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


666 


PR00819 


CBXX/CFQX SUPERFAMILY
SIGNATURE 


PR00819B 10.83 8.90Ge- 
10 704-720 


667 


BL50040 


Elongation factor 1 
gamma chain profile. 


BL50040C 22.62 2.143e- 
16 135-178 


668 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 1.360e- 
09 139-153 PR00019A 
11.19 1.667e-09 94-108 
PR00019B 11.36 4.600e- 
09 163-177 


670 


BL00018 


EF-hand calcium-binding
domain proteins.


BL00018 7.41 3.250e-10
681-694 BL00018 7.41 
6.400e-10 717-730 


672 


PD00131 


ATP-BINDING TRANSPORT
TRANSMEMBR.


PD00131B 34.97 1.000e-
34 356-410 PD00131C 
19.59 1.346e-26 504- 
542 


673 


PR00667 


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR
SIGNATURE 


PR00667G 15.33 7.557e- 
10 106-123 


674 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320A 16.74 4.857e-
13 593-608 PR00320B
12.19 4.115e-12 635-
650 PR00320C 13.01
8.435e-11 717-732
PR00320C 13.01 2.800e-
10 635-650 PR00320C
13.01 6.400e-10 593-
608 PR00320B 12.19
3.250e-09 593-608


675 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320A 16.74 4.857e-
13 572-587 PR00320B
12.19 4.115e-12 614-
629 PR00320C 13.01
8.435e-11 696-711
PR00320C 13.01 2.800e-
10 614-629 PR00320C
13.01 6.400e-10 572-
587 PR00320B 12.19
3.250e-09 572-587


676 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 9.667e-
09 249-263 


679 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 3.700e-
16 225-236 PF00642
11.59 7.900e-12 187- 
198 


680 


PR00308


TYPE I ANTIFREEZE
PROTEIN SIGNATURE


PR00308C 3.83 8.754e-
10 286-296 


681 


BL00019 


Actinin-type actin-
binding domain proteins.


BL00019D 15.33 4.200e-
19 227-257 


682 


PR00700


PROTEIN TYROSINE
PHOSPHATASE SIGNATURE


PR00700D 12.47 4.000e-
09 99-118 


687 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.500e-
10 538-553 


689 


BL01024 


Protein phosphatase 2A
regulatory subunit PR55
proteins.


BL01024A 10.26 1.000e-

8.91 1.000e-40 86-127
7.50 1.000e-
40 146-185 BL01024D
13.22 1.000e-40 185-
222 BL01024E 11.96
1.000e-40 222-266
BL01024F 9.42 1.000e-
40 266-317 BL01024G
11.09 1.000e-40 317-
349 BL01024H 13.88
1.000e-40 389-442


691 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 8.071e-
31 152-195


692 


BL00211 


ABC transporters family 
proteins.


BL00211A 12.23 5.050e- 
09 45-57 


693 


BL00211 


ABC transporters family 
proteins.


BL00211A 12.23 5.050e-
09 45-57


694 


BL00211 


ABC transporters family 
proteins.


BL00211A 12.23 5.050e- 
09 58-70 


696 


BL00680 


Methionine
aminopeptidase subfamily
1 proteins.


BL00680 14.37 5.304e- 
17 173-195 


697 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 3.418e- 
11 242-265 


698 


DM01930 


2 kw FINGER SMCX SMCY 
YDR096W. 


DM01930E 15.41 1.367e-
37 170-215 DM01930F
14.16 8.232e-28 267-
303 DM01930B 19.86 
9.163e-10 37-71 


700 


PR00869 


DNA-POLYMERASE FAMILY X
SIGNATURE 


PR00869A 12.80 1.281e- 
16 245-263 


701 


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 2.174e-
10 77-91 PR00048A
10.52 6.870e-10 133-
147 PR00048A 10.52
8.826e-10 105-119 
PR00048A 10.52 5.320e- 
09 161-175 


702 


BL00523


Sulfatases proteins.


BL00523E 19.27 2.565e-
25 326-356 BL00523A
13.36 5.050e-16 38-55
BL00523B 8.64 5.909e-
15 86-98 BL00523C
12.64 5.500e-13 137-
148 BL00523D 9.89
1.844e-11 290-302
BL00523G 9.46 5.500e- 
10 513-523 BL00523F 
10.85 6.351e-09 413- 
424 


703 


PR00048 


C2H2-TYPE ZINC FINGER
SIGNATURE 


PR00048A 10.52 8.412e- 
12 376-390 PR00048B 
6.02 1.000e-10 334-344 
PR00048B 6.02 1.474e- 
09 364-374 


707 


PD00787 


SYNTHASE BIOSYNTHESIS 
TRANSFERASE. 


PD00787A 14.84 8.941e-
14 66-82 


708 


PR00761 


BINDIN PRECURSOR 
SIGNATURE 


PR00761E 14.32 8.500e-
10 822-841 


712 


DM01354 


kw TRANSCRIPTASE REVERSE 
II ORF2. 


DM01354Y 10.69 4.977e- 
38 425-465 DM01354X 
13.86 7.300e-34 376- 
415 DM01354V 12.97 
4.923e-17 311-358 
DM01354W 12.64 5.596e- 
10 356-376


713 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 7.545e-
27 450-496 BL00039A
18.44 2.537e-18 147-
186 BL00039C 15.63
2.216e-14 280-304
BL00039B 19.19 1.947e-
13 194-220 


715 


BL00383


Tyrosine specific
protein phosphatases
proteins.


BL00383E 10.35 4.981e-
10 150-161


717


PF00777 


Sialyl transferase 
family. 


PF00777C 18.60 4.035e- 
21 106-161 


718


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 3.750e-
39 20-66 DM00031B
15.41 2.688e-28 84-118
DM00031C 12.79 1.300e-
12 131-142 


719


BL00243


Integrins beta chain 
cysteine-rich domain 
proteins. 


BL00243B 17.54 1.000e-
40 131-172 BL00243C
16.42 1.000e-40 172-
208 BL00243D 24.07
1.000e-40 222-274
BL00243F 22.63 1.000e-
40 314-358 BL00243I
31.77 6.571e-39 607-
650 BL00243E 16.70
3.077e-35 274-304
BL00243G 21.38 3.625e-
34 358-400 BL00243H
17.53 5.235e-29 567-
593 BL00243A 17.61
3.250e-21 63-84
BL00243H 17.53 7.167e-
16 477-503 BL00243H
17.53 2.304e-11 524-
550 BL00243H 17.53
5.304e-11 606-632
BL00243I 31.77 1.380e- 
09 610-653 


720 


PR00217


43 KD POSTSYNAPTIC 
PROTEIN SIGNATURE 


PR00217C 10.91 8.022e- 
09 20-36 


722 


PR00704 


CALPAIN CYSTEINE 
PROTEASE (C2) FAMILY 
SIGNATURE 


PR00704D 11.05 5.909e-
34 135-161 PR00704F
13.61 7.000e-26 190-
218 PR00704E 12.55
8.071e-26 165-189
PR00704B 17.94 2.241e- 
23 75-98 PR00704A 
14.68 4.094e-19 30-54 
PR00704C 11.88 1.871e- 
18 99-116 


725 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e- 
09 169-187 


726 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e-
09 169-187 


727 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320C 13.01 2.125e-
13 277-292 PR00320A
16.74 1.310e-11 277-
292 PR00320C 13.01
4.522e-11 323-338
PR00320A 16.74 6.586e- 
11 323-338 PR00320B 
12.19 4.343e-10 323- 
338 PR00320B 12.19 
6.914e-10 277-292 


731 


PR00195 


DYNAMIN SIGNATURE 


PR00195A 11.94 8.627e-
16 288-307 PR00195E
9.82 3.912e-11 457-474


733 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.082e- 
10 787-798 


738 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039A 18.44 2.565e-
28 26-65 BL00039D
21.67 2.105e-20 338-
384 BL00039C 15.63
9.100e-13 160-184 
BL00039B 19.19 9.617e- 
11 73-99 


739 


BL01289 


TSC-22 / dip / bun 
family proteins. 


BL01289A 12.18 8.909e- 
31 326-353 BL01289B 
10.45 9.571e-17 353- 
383 


742 


BL01019 


ADP-ribosylation factors
family proteins. 


BL01019A 13.20 7.078e- 
12 41-81 


743 


BL00965 


Phosphomannose isomerase 
type I proteins. 


BL00965C 23.78 1.000e-
40 256-305 BL00965B 
17.77 1.600e-25 126- 
153 BL00965A 10.57 
6.400e-19 94-113 


747


BL00021 


Kringle domain proteins. 


BL00021D 24.56 4.563e-
25 231-273 BL00021B 
13.33 5.345e-21 60-78 


748 


BL00612


Osteonectin domain 
proteins. 


BL00612B 11.35 2.034e- 
11 93-126 


749 


PR00450 


RECOVERIN FAMILY
SIGNATURE


PR00450C 12.22 6.880e-
10 135-157


752 


BL00795 


Involucrin proteins.


BL00795C 17.06 6.000e-
11 384-429 BL00795C
17.06 9.444e-11 370-
415 


754 


BL00051 


Ribosomal protein L39e 
proteins.


BL00051 20.92 1.935e-
16 4-50 


755 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III.


DM01970B 8.60 7.723e- 
09 171-184 


760 


BL01020 


SAR1 family proteins.


BL01020C 15.35 9.020e-
12 99-150 


762 


BL00046


Histone H2A proteins.


BL00046 12.95 1.000e-
40 33-88 


763 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 9.137e- 
10 206-240 


764 


BL00027


'Homeobox' domain
proteins. 


BL00027 26.43 8.800e- 
29 417-460 


767 


BL01208


VWFC domain proteins.


BL01208B 15.83 6.063e-
10 309-324 BL01208B
15.83 8.031e-10 165- 
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SEQ ID NO: 



ACCESSION 
NO. 



770 



772 



BL00031 



PRO0449 



773 



BL0O523 



775 



BL0O028 



DESCRIPTION 



RESULTS' 



180 BL01208B 15.83 
4.l62e-09 85-100 



Nuclear hormones 
receptors DNA- binding 
region proteins. 



BL00031A 19.55 9.571e- 
32 -208-241 BL00031B 
22.25 5.500e-27 242- 
274 



TRANSFORMING PROTEIN P2T 
RAS SIGNATURE 



PR00449A 13.20 1.4S0e- 
18 4-26 PR00449E 
13.50 3.520e-14 142- 
165 PR00449C 17.27 
3.032e-13 44-67 
PR00449D 10.79 8.579e- 
13 107-121 PR00449B 
14.34 3.455e-11 27-44



Sulfatases proteins 



BL00523E 19.27 9.333e-
23 299-329 BL00523A
13.36 2.200e-13 47-64
BL00523B 8.64 2.607e-
13 91-103 BL00523D
9.89 7.923e-12 224-236
BL00523C 12.64 4.512e-
10 141-152 BL00523F
10.85 5.821e-10 373- 
384 



Zinc finger, C2H2 type, 
domain proteins . 



Zinc finger, C2H2 type 
domain proteins 



BL00028 16.07 7.686e- 
09 568-585 



776 



BL00028



777 
"77F 



BL00028



Zinc finger, C2H2 type, 
domain proteins 



BL00028 16.07 7.686e- 
09 621-638 



BL00028 16.07 7.686e- 
09 595-612 



BL00030



Eukaryotic RNA-binding
region RNP-1 proteins.



BL00030A 14.39 8.412e-
11 322-341 BL00030A
14.39 7.000e-10 220- 
239 



779 



PR00079 



781 



BL00215



783 



PD00289



785 



BL00690 



786 



PR00449 



788 



DM01206 



790 



BL00915 



GLUCOSE-6-PHOSPHATE
DEHYDROGENASE SIGNATURE 



PR00079B 12.98 2.929e- 
26 193-222 PR00079E 
16.65 4.150e-23 348-
375 PR00079C 8.68 
6.351e-16 246-264 
PR00079D 13.51 7.070e-
16 264-281 PR00079A
16.12 6.769e-13 169- 
183 



Mitochondrial energy 
transfer proteins. 



BL00215A 15.82 9.250e- 
17 10-35 BL00215A 
15.82 6.000e-16 221- 
246 BL00215A 15.82 
7.857e-12 108-133 
BL00215B 10.44 9.526e- 
11 168-181 



PROTEIN SH3 DOMAIN 
REPEAT PRESYNA. 



PD00289 9.97 6.276e-09 
159-173 



DEAH-box subfamily ATP-
dependent helicases
proteins.



BL00690B 13.38 1.000e-
12 147-165 BL00690A
6.87 5.320e-10 114-124 
BL00690C 7.51 3.189e- 
09 218-228 



TRANSFORMING PROTEIN P21 
RAS SIGNATURE 



CORONAVIRUS NUCLEOCAPSID
PROTEIN. 



PR00449C 17.27 8.500e- 
16 50-73 PR00449A 
13.20 5.235e-14 8-30 
PR00449E 13.50 2.853e- 
11 150-173 PR00449D 
10.79 1.545e-09 111-
125

DM01206B 10.69 8.767e- 
10 1-21 



Phosphatidylinositol 3-
and 4-kinases proteins.



BL00915C 22.43 9.182e-
39 725-764 BL00915B
22.78 5.050e-33 633- 
671 BL00915D 27.02 
1.529e-21 795-831 
BL00915A 10.09 1.000e-
13 395-407


791 


PR00208 


GLIADIN AND LMW GLUTENIN
SUPERFAMILY SIGNATURE


PR00208A 12.59 6.294e- 
10 120-138 PR00208A 
12.59 6.294e-10 121- 
139 PR00208A 12.59 
6.294e-10 122-140 
PR00208A 12.59 6.294e- 
10 123-141 PR00208A 
12.59 6.294e-10 124-
142 PR00208A 12.59 
6.294e-10 125-143 
PR00208A 12.59 6.294e- 
10 126-144 PR00208A 
12.59 6.294e-10 127- 
145 PR00208A 12.59 
6.294e-10 128-146 
PR00208A 12.59 6.294e-
10 129-147 PR00208A 
12.59 7.411e-09 130- 
148 PR00208A 12.59 
7.658e-09 131-149 
PR00208A 12.59 7.904e- 
09 132-150 PR00208A 
12.59 8.274e-09 118- 
136 PR00208A 12.59 
8.274e-09 119-137 


795 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 5.034e- 
16 302-320 PR00205A 
14.73 1.257e-11 284-
300 PR00205C 13.65
1.333e-11 337-352


796 


BL00412


Neuromodulin (GAP-43)
proteins . 


BL00412D 16.54 4.000e-
12 196-247 BL00412D
16.54 5.705e-11 197-
248 BL00412D 16.54
7.848e-10 199-250
BL00412D 16.54 1.827e-
09 195-246 BL00412D
16.54 1.918e-09 194-
245 BL00412D 16.54 
2.102e-09 201-252 


797 


BL00021


Kringle domain proteins.


BL00021B 13.33 6.339e- 
13 40-58 


799


BL01052


Calponin family repeat
proteins.


BL01052C 18.51 1.000e-
40 87-127 BL01052A
16.12 1.529e-32 3-35
BL01052B 15.31 1.257e-
25 52-78 BL01052D
10.26 5.737e-25 174- 
194 


800 


BL00348


p53 tumor antigen 
proteins . 


BL00348F 23.19 3.714e- 
09 197-240 


801 


BL00309


Vertebrate galactoside- 
binding lectin proteins . 


BL00309C 18.65 1.621e- 
09 62-87 


802 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245D 10.47 5.224e- 
09 187-199 


804 


PF00774 


Dihydropyridine
sensitive L-type calcium
channel (Beta subuni.


PF00774A 16.47 8.457e- 
10 110-156 


808 


PR00667


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR 
SIGNATURE 


PR00667C 11.71 9.875e- 
09 12-28 


810


PD02346 


PHOTOSYSTEM II PROTEIN
PRECURSOR
PHOTOSYNTHESIS.


PD02346F 12.89 4.340e-
09 317-354


811 


BL00685 


CBF-A/NP-YB subunit 
proteins. 


BL00685B 14.41 6.779e-
14 54-95 BL00685A
11.22 4.798e-13 5-54


812 


PR00080 


ALCOHOL DEHYDROGENASE
SUPERFAMILY SIGNATURE 


PR00080A 9.32 9.419e-
10 93-105


813 


BL00357 


Histone H2B proteins. 


BL00357 7.74 1.988e-17 
22-65 


815 


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.923e-
15 158-171 PD00066
13.92 5.200e-14 46-59
PD00066 13.92 7.000e-
14 18-31 PD00066
13.92 7.000e-13 130-
143 PD00066 13.92
7.500e-13 214-227
PD00066 13.92 9.000e-
13 102-115 PD00066
13.92 4.429e-12 186-
199 PD00066 13.92
1.783e-11 74-87


816 


BL01195 


Peptidyl-tRNA hydrolase
proteins. 


BL01195C 20.12 3.348e- 
20 100-139 


820 


BL00520


Interleukin-10 family 
proteins. 


BL00520A 6.21 6.471e- 
09 1-14 


822 


BL00972 


Ubiquitin carboxyl-
terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 8.113e- 
09 224-242 


825 


PR00876 


NEMATODE METALLOTHIONEIN
SIGNATURE 


PR00876B 7.66 2.268e- 
10 101-115 


829 


PD02855


FLAVOPROTEIN PROTEIN
DNA/PANTOTHEN.


PD02855A 18.37 4.732e-
28 88-124 PD02855B
8.36 6.478e-09 132-142 


830 


PR00405


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 7.000e- 
21 44-62 PR00405C 
19.41 1.000e-13 65-87 
PR00405A 17.71 7.283e- 
13 25-45 


831 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 1.000e-
09 47-61 PR00019B
11.36 1.720e-09 136-
150 PR00019B 11.36
3.880e-09 44-58


832 


PR00011


TYPE III EGF-LIKE
SIGNATURE


PR00011B 13.08 3.438e-
16 164-183 PR00011D
14.03 6.850e-16 164-
183 PR00011A 14.06
8.364e-14 164-183
PR00011C 24.25 5.415e-
12 231-260 PR00011D
14.03 9.852e-11 212-
231


834 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 232-246 


835 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 4.000e- 
10 290-304 


836 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 216-230 


837 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 3.898e- 
09 78-111 


839 


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN . 


PD02784B 26.46 8.302e- 
09 73-116 


840 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 5.091e- 
22 369-390 PR00700D 
12.47 5.765e-21 491- 
510 PR00700C 13.17 
4.750e-14 449-467 
PR00700F 11.18 8.500e- 
11 538-549 PR00700E 
17.57 3.100e-10 522- 
538 


o ^* 


PR00109


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 5.404e- 
13 134-153 


844 


PD02785 


PROTEIN RIBOSOMAL 60S 
L22 RNA- BINDING KEP. 


PD02785B 14.43 1.000e-
40 58-112 PD02785A
15.23 1.915e-28 8-57 


845 


BL00826


MARCKS family proteins. 


BL00826C 7.63 6.738e- 
09 203-230 


846 


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 4.429e- 
10 15-24 


849 


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.000e-
08 340-349


850 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308A 5.90 6.506e- 
09 12-27 


851 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 7.000e- 
16 246-280 


852 


BL00420


Speract receptor repeat
proteins domain
proteins.


BL00420B 22.67 1.000e-
40 723-778 BL00420B
22.67 1.321e-38 933- 
988 BL00420B 22.67 
8.457e-28 482-537 
BL00420B 22.67 4.500e- 
27 587-642 BL00420B 
22.67 9.625e-27 270- 
325 BL00420B 22.67 
4.205e-26 163-218 
BL00420B 22.67 5.731e- 
23 55-110 BL00420B
22.67 6.464e-20 377- 
432 BL00420B 22.67 
2.800e-15 830-885 
BL00420C 11.90 1.900e- 
13 355-366 BL00420C 
11.90 1.900e-12 808- 
819 BL00420C 11.90 
3.550e-12 248-259 
BL00420C 11.90 2.831e- 
11 141-152 BL00420C 
11.90 5.119e-11 1018-
1029 BL00420C 11.90 
7.955e-10 567-578 


853 


BL00420 


Speract receptor repeat 
proteins domain 
proteins . 


BL00420B 22.67 1.000e-
40 756-811 BL00420B
22.67 1.321e-38 966- 
1021 BL00420B 22.67 
8.457e-28 482-537 
BL00420B 22.67 4.500e- 
27 620-675 BL00420B 
22.67 9.625e-27 270- 
325 BL00420B 22.67 
4.205e-26 163-218
BL00420B 22.67 5.731e-
23 55-110 BL00420B
22.67 6.464e-20 377-
432 BL00420B 22.67
2.800e-15 863-918
BL00420C 11.90 1.900e- 
13 355-366 BL00420C 
11.90 1.900e-12 841- 
852 BL00420C 11.90 
3.550e-12 248-259 
BL00420C 11.90 2.831e- 
11 141-152 BL00420C 
11.90 5.119e-11 1051-
1062 BL00420C 11.90 
7.955e-10 567-578


857 


PR00388


3',5'-CYCLIC NUCLEOTIDE
CLASS II
PHOSPHODIESTERASE
SIGNATURE


PR00388A 10.45 2.778e- 
09 64-83 


859 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 2.929e- 
13 37-56 BL00030B 
7.03 1.900e-11 167-177
BL00030A 14.39 2.000e- 
10 128-147 


861 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.250e-
17 23-41 PR00988C
13.64 8.714e-16 107-
123 PR00988F 12.23
7.828e-15 198-212
PR00988E 8.27 9.769e-
12 176-188 PR00988D
5.95 8.250e-11 163-174
PR00988B 11.60 4.512e-
10 60-72 


863 


BL00215


Mitochondrial energy
transfer proteins.


BL00215B 10.44 8.071e-
12 41-54 


864


PR00775 


90 KD HEAT SHOCK PROTEIN 
SIGNATURE 


PR00775E 8.06 1.000e-
24 198-221 PR00775B
3.52 1.837e-23 107-130 
PR00775D 8.91 4.484e- 
17 171-189 PR00775A 
9.90 8.342e-17 86-107 
PR00775C 10.68 9.379e- 
17 153-171 PR00775G 
10.64 6.850e-15 267- 
286 PR00775F 12.76 
6.769e-14 249-267 


866


DM01688


2 POLY-IG RECEPTOR.


DM01688G 16.45 9.460e-
09 89-121


867 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU.


PD01066 19.43 5.596e- 
29 14-53 


868


BL01287 


RNA 3'-terminal
phosphate cyclase 
proteins . 


BL01287A 17.95 2.688e- 
26 16-48 


869 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.464e- 
10 304-337 


872 


BL00046


Histone H2A proteins.


BL00046 12.95 1.000e-
40 30-85 


874 


BL00188 


Biotin-requiring enzymes 
attachment site
proteins.


BL00188 30.29 9.036e- 
32 665-711 


876 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.686e- 
09 298-315 


877 


PD02102 


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 
HYDROL. 


PD02102A 16.74 4.176e- 
10 97-141 


879 


BL01189 


Ribosomal protein S12e 
proteins. 


BL01189A 14.27 1.000e-
40 35-71 BL01189B
13.49 1.000e-40 71-125 


882 


BL00284 


Serpins proteins . 


BL00284C 28.56 6.400e- 
25 62-104 BL00284B 
17.99 6.182e-12 35-56 


889 


BL00216 


Sugar transport 
proteins . 


BL00216B 27.64 4.375e- 
21 3.5-85 


896 


PR00391


PHOSPHATIDYLINOSITOL
TRANSFER PROTEIN 
SIGNATURE 


PR00391E 12.50 7.785e- 
15 211-231 PR00391B 
8.39 1.000e-13 83-104 
PR00391D 12.21 9.328e- 
13 191-207 PR00391A 
7.83 5.390e-11 16-36


897


PR00327 


ICE NUCLEATION PROTEIN
SIGNATURE


PR00327C 6.37 5.247e-
09 313-328


898 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 7.800e- 
26 386-432 BL00039A 
18.44 6.674e-16 113- 
152 BL00039B 19.19 
1.947e-13 153-179
BL00039C 15.63 9.460e-
11 236-260 


am 




METAL-BINDI.


PD00066 13.92 8.000e-
16 254-267 PD00066
13.92 8.200e-16 282-
295 PD00066 13.92
8.200e-16 310-323
PD00066 13.92 8.200e- 
16 366-379 PD00066 
13.92 8.200e-16 394- 
407 PD00066 13.92 
8.200e-14 338-351 


902 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 9.321e- 
11 6-50 


903 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 9.160e- 
09 97-111 


904 


PR00381


KINESIN LIGHT CHAIN 
SIGNATURE 


PR00381E 8.75 6.586e- 
25 335-356 PR00381B 
18.17 2.667e-24 204- 
224 PR00381A 9.55 
2.800e-24 107-125 
PR00381C 12.48 4.522e- 
24 226-245 PR00381D 
13.94 1.084e-22 291- 
309 PR00381F 9.13 
3.288e-22 370-392 
PR00381F 9.13 7.181e-
13 286-308 PR00381E

PR00381E 8.75 7.033e- 

8.75 8.364e-10 377-398 
PR00381D 13.94 5.230e- 
09 333-351 PR00381C 
12.48 7.120e-09 310- 
329 


906 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e-
09 525-549


907 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e- 
09 513-537 


908 


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 9.308e-11
144-155


910 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 2.800e- 
30 48-87 


912 


BL01104 


Ribosomal protein L13e
proteins. 


BL01104C 15.14 6.000e- 
09 364-392 


922 


BL00678


Trp-Asp (WD) repeat 
proteins proteins . 


BL00678 9.67 3.842e-09 
500-511 


923 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 2.500e- 
09 323-338 PR00320C
13.01 5.500e-09 187- 
202 


924 


PD02181 


PROTOCHLOROPHYLLIDE 
REDUCTASE PHOTOSYNT.


PD02181D 12.85 8.609e- 
09 36-64 


926 


BL00019 


Actinin-type actin-
binding domain proteins. 


BL00019C 14.66 7.453e- 
25 108-144 BL00019B 
13.34 6.510e-11 61-84
BL00019D 15.33 9.338e- 
11 205-235 BL00019A 
12.56 2.373e-10 34-45 


928 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 9.308e-11
273-284 BL00678 9.67 
1.600e-10 314-325 

fcJijUU o / a y . b / / . ouue-iu 
360-371 BL00678 9.67 
8.579e-09 206-217 


929 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 1.857e- 
10 137-146 


930 


BL01085 


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e- 

xjii — led DiiUiuo^tJ 
10.15 5.680e-22 30-52

OilwxUO J& J- O .Of O . D / OC 

20 172-202 BL01085C 


931 


BL01085 


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e-

10.15 5.680e-22 30-52 
BL01085E 18.87 8.676e- 

21.81 2.038e-14 66-97 


J J w 


ruju juj. 


OPDPST MTTO<*»T "D 
t .\U i m_ U KLrCiKl l1U£>V-l-ii£ 

CALCIUM-BI. 


fJJUUJUlA 1U . 24 b . 4UUe- 
09 150-171


936


PF00168 


C2 domain proteins. 


PF00168C 27.49 4.000e- 
12 3 J 6 -362 


937


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.519e- 

in c Ad 
x u y 


940


PR00862 


PROLYL OLIGOPEPTIDASE
SIGNATURE 


PR00862D 16.17 4.086e- 

no /~ *3 a * 
OS bJ-B* 


OA C 




RNA methyltransferase
trmA family proteins.


BLOX^JUB 11 „ ©2 2. J/je- 
09 407-420 


ft 




Phorbol esters /
diacylglycerol binding
domain proteins.


BL004 /9xi 12.57 V . 429e- 
18 52-68 BL00479A 
19.86 2.200e-13 26-49 




BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 1.474e-09

100-111 


Jill 


ot"\ a till 


PROTEIN OXIDOREDUCTASE
NAD INTERGENIC RE. 


FD01J11A 30.23 5 . 909e- 
10 66-111 




PF00651


BTB (also known as BR-
C/Ttk) domain proteins.


12 47-60 


956 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins.


PF00651 15.00 3.250e- 

to d*7_cn 
X £. % 1 D U 


957 


BL00379 


CDP-alcohol
phosphatidyltransferases
proteins.


BL00379 24.64 1.610e-

111— 14ft 


959 




GTP-binding nuclear
protein ran proteins.


10 31-75 


960 


BL01115 


GTP-binding nuclear 
protein ran proteins . 


BL01115A 10.22 3.438e- 
14 110-154 


962 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.586e- 
13 198-236 


963 


PR00502 


MUTT DOMAIN SIGNATURE 


PR00502A 15.06 8.200e- 

in oi n_ one 


966


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308A 5.90 7.035e- 
09 55-70 


967 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 1.286e- 
12 104-124 DM01206B 

Tfi cr 9QQo.n O"^— A 
X u • D J 3 , Zjjc - 1 x. Z J - H. J> 

DM01206B 10.69 8.274e-
10 73-93 DM01206B
10.69 3.962e-09 108-
128 DM01206B 10.69
5.671e-09 38-58 


969 


PF01008 


Initiation factor 2 
subunit . 


PF01008B 25.59 4.724e- 
31 417-460 PF01008C
12.25 5.333e-18 506-
526 PF01008A 20.14
5.875e-15 369-390


970 


BL01277 


Ribonuclease PH 
proteins . 


BL01277C 10.18 7.648e- 
10 112-143 BL01277A 
17.39 9.806e-10 40-78 


975 


BL01159


WW/rsp5/WWP domain 
proteins . 


BL01159 13.85 3.605e- 
12 130-145 BL01159 
13.85 4.122e-10 171-
186 


977 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791C 20.98 2.235e- 
09 55-94 


978 


BL01167


Ribosomal protein L17 
proteins . 


BL01167B 20.66 8.258e- 
19 88-127 


979


BL00478 


LIM domain proteins. 


BL00478B 14.79 9.357e-
13 33-48 BL00478B 
14.79 7.250e-12 98-113 


980 


PR00312 


CALSEQUESTRIN SIGNATURE 


PR00312E 8.32 3.423e- 
36 169-199 PR00312I 
15.78 5.286e-35 332-
361 PR00312F 15.06 
5.865e-35 199-229 
PR00312H 13.31 8.313e- 
35 263-291 PR00312J 
13.73 5.688e-34 363- 
392 PR00312D 9.43 
2.636e-33 128-158
PR00312C 15.14 8.839e- 
33 92-122 PR00312B 
15.08 8.941e-33 62-92 
PR00312G 11.11 6.657e- 
32 230-258 PR00312A 
11.70 6.914e-27 35-59 


981 


PF00992 


Troponin . 


PF00992A 16.67 8.816e-
09 414-449


982 


PR00299


ALPHA CRYSTALLIN
SIGNATURE 


PR00299F 13.20 2.367e- 
09 127-149 


983


BL01150 


Respiratory-chain NADH
dehydrogenase 20 Kd 
subunit proteins. 


BL01150B 17.16 1.000e-
40 156-202 BL01150A
14.10 8.200e-39 100- 
138 


986


BL00795


Involucrin proteins.


BL00795C 17.06 7.211e-
14 4-49 BL00795C
17.06 1.778e-11 1-46
BL00795C 17.06 3.407e- 
10 14-59 BL00795C 
17.06 7.802e-10 2-47 
BL00795C 17.06 8.640e- 
10 19-64 BL00795C 
17.06 7.400e-09 11-56 
BL00795C 17.06 7.800e- 
09 3-48 


987 


BL00939


Ribosomal protein L1e
proteins.


BL00939F 17.27 5.393e- 
09 810-840 


988


PR00452


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e- 
11 525-541 


989 


PR00452


SH3 DOMAIN SIGNATURE


PR00452B 11.65 6.538e-
11 497-513 


994 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 2.500e- 
25 146-189 


997 


BL01304


ubiH/COQ6 monooxygenase
family proteins. 


BL01304A 8.05 3.893e- 
11 65-79 


998 


DM01767 


5 TRANSMITTER DOMAIN . 


DM01767B 10.07 7.868e- 
09 22-39 


1000


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926C 16.07 1.750e- 
24 73-94 PR00926D 
10.53 3.250e-23 126- 
145 PR00926F 17.75 
6.211e-23 217-240 
PR00926E 11.70 6.625e- 
20 174-193 PR00926B 
16.07 2.125e-18 24-39 
PR00926A 10.41 l.OOCe- 
jlj jij *j rivvi/j^or 
17.75 5.565e-09 120- 
143 


1005 


BL00406


Actins proteins.


BL00406B 5.47 1.000e-
40 88-143 BL00406C
6.75 1.000e-40 147-202 
rJiiUUi ut?u x^S.jO j . /uue — 
40 270-325 BL00406E 
8.44 7.375e-38 327-377 
BL00406A 9.95 3.348e- 
29 11-46 


1006 


BL00406 


Actins proteins.


BL00406B 5.47 1.000e-
40 88-143 BL00406C
6.75 1.000e-40 147-202
BL00406E 8.44 1.000e-
35 248-298 BL00406A
9.95 3.348e-29 11-46 


IUUJ 


PR00304


TAILLESS COMPLEX
POLYPEPTIDE 1
(CHAPERONE) SIGNATURE


PR00304D 11.04 8.714e- 
22 384-407 PR00304C 
8.69 4.667e-20 98-118 
PR00304B 11.60 7.577e-
19 68-87 PR00304A 
9 . ZD J . 382e- lo 46-63 
PR00304E 7.79 6.870e- 
13 41o- 4 JJ. 


1009 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 2.929e- 
32 9-48 


1011 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 2.929e-
32 58-107 


1012


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins. 


BL00518 12.23 6.143e- 
10 64-73 


X U J. t> 


rDUllbo 


PROTEIN ALANYL. 


PDOllooii lZ.Oo i.oooe — 
11 174-194 


1018 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION . 


PD00930B 33.72 1.391e- 
32 261-302 PD00930A 
. bid y . D3Ue-<;4; xd / — 
183 


v & & 




Phosphoglycerate mutase
family phosphohistidine


DJuUUl /3A iJ.'ii D ■ JL / 3 c- 

12 6-26 BL00175C 

53 75" ft 0f»?**-lO "79-3.11 


1025 


PR00305 


14-3-3 PROTEIN ZETA 
SIGNATURE 


PR00305D 16.34 1.439e- 
10 158-185 


1026 


BL00353


HMG1/2 proteins. 


BL00353B 11.47 2.436e- 

-L O £. J O O O OUUwJQJ v> 

14.83 8.844e-ll 288- 
335 


1028 


BL00183


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 1.310e-
33 43-91 


1033 


PF00580


UvrD/REP helicase.


PF00580A 13.37 4.720e-
09 111-133 


1034 




DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY 
SIGNATURE 


PPOfJAl^P m 7ft T iTqp. 
XT Av V IOC X _> . /O J , 4^7C 

09 154-171 


1037 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 9.657e-
09 5-44 


1038 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD01796 15.01 4.259e- 
11 55-82 


1039 


BL00299 


Ubiquitin domain 
proteins . 


BL00299 28.84 9.036e- 
09 17-69 


1040


PR00970 


ARGININE ADP-
RIBOSYLTRANSFERASE
SIGNATURE


PR00970A 17.73 6.143e-
20 56-78 PR00970D
9.96 2.154e-18 154-171
PR00970F 12.30 1.000e-
16 224-241 PR00970G
9.97 9.229e-15 242-258
PR00970B 16.37 1.290e-
13 86-105 PR00970C
11.05 1.643e-11 115-
130 PR00970E 11.23
9.820e-11 202-218


1042 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 2.200e-10 
243-254 


1043 


PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 6.786e-
13 114-128 PR00048A
10.52 1.000e-09 172- 
186 


1045 


BL00615 


C-type lectin domain 
proteins . 


BL00615A 16.68 1.720e- 
11 218-236 BL00615B 
12.25 1.857e-10 317- 
331 


1046 


BL01092 


Adenylate cyclases 
class-I proteins.


BL01092N 13.54 8.924e- 
10 3-40 


1047


BL01216 


ATP-citrate lyase / 
succinyl-CoA ligases 
family proteins . 


BL01216D 21.75 4.316e- 
28 314-344 BL01216A 
13.91 1.000e-10 97-112 


1049 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 7.618e-
12 102-136 


1050 


BL01073 


Ribosomal protein L24e
proteins.


BL01073 24.30 1.000e-
40 12-62


A w 


BL00571


Amidases proteins.


BL00571 25.69 5.875e-
31 160-212 


-X v J J 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 5.235e- 
11 98-117 BL00030B 
7.03 4.316e-09 137-147 


1058 


BL00223 


Annexins repeat proteins
domain proteins.


BL00223C 24.79 8.754e- 
23 262-317 BL00223A 
15.59 9.478e-14 46-80 
BL00223A 15.59 5.557e- 
11 118-152 


1060 


BL00027


'Homeobox' domain
proteins. 


BL00027 26.43 3.455e- 
35 158-201 


1064 


BL00455


Putative AMP-binding
domain proteins.


BL00455 13.31 6.211e-
13 280-296 


1065 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 2.000e-
09 115-129 PR00019B
11.36 3.880e-09 87-101


1066 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 4.600e- 
16 151-172 PR00326C 
9.79 1.290e-14 200-216
PR00326B 16.74 8.548e- 
14 172-191 PR00326D 
19.09 1.257e-13 217- 
236 


1071 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR . 


PD02870B 18.83 8.518e- 
11 164-197 


1072 


PF00856


SET domain proteins. 


PF00856A 26.14 5.976e- 
09 350-387 


1075 


BL01009 


Extracellular proteins 
SCP/Tpx-1/Ag5/PR-1/Sc7
proteins.


BL01009D 14.19 4.300e- 
20 127-148 BL01009A 
13.75 6.586e-13 57-75 
BL01009E 13.50 1.439e- 
11 159-175 


1077 


PR00724 


CARBOXYPEPTIDASE C
SERINE PROTEASE (S10)
FAMILY SIGNATURE


PR00724A 10.91 1.000e-
08 366-379 


1078 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 1.000e-
12 170-195 BL00215A
15.82 7.529e-10 79-104


1079 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 4.316e-09
298-309


1081 


BL00326


Tropomyosins proteins. 


BL00326A 14.01 7.398e- 
10 23-57 


1094 


BL00460


Glutathione peroxidases
selenocysteine proteins.


BL00460A 28.67 3.204e-
16 57-92 BL00460B

9.73 6.400e-13 100-118 
BL00460D 16.89 9.143e- 
12 162-182 BL00460C 
14.35 5.500e-09 133- 
loo 


1095 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB 


PD02811A 20.67 3.017e-
22 67-105 PD02811B
17.07 2.263e-21 118-
151 PD02811C 13.25
5.696e-13 154-167


1096 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB 
FIMBRIA TRAN. 


PD02811A 20.67 3.017e- 
22 60-98 PD02811B 
17.07 2.263e-21 111- 
144 PD02811C 13.25
5.696e-13 147-160


1097 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 6.143e-
09 200-216


1105 


PF00881 


Nitroreductase family.


PF00881A 27.15 9.229e- 
13 111-147 


1109 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.077e- 
10 15-37 PR00449E 
13.50 1.857e-09 185- 
208 PR00449D 10.79 
8.364e-09 131-145 


1115 


PR00405


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.737e- 
20 42-60 PR00405A 
17.71 2.703e-17 23-43 
PR00405C 19.41 6.902e-
10 63-85


1116 


BL00355


HMG14 and HMG17
proteins.


BL00355 5.97 2.528e-25
20-51 


1117 


BL00355


HMG14 and HMG17
proteins.


BL00355 5.97 2.528e-25
20-51 


1120 


BL00107


Protein Kinases ATP-
binding region proteins.


BL00107B 13.31 4.557e-
10 290-306 


1123 


PR00412


EPOXIDE HYDROLASE


PR00412F 18.76 9.526e-
12 301-324


1125 


PR00186 


HEMERYTHRIN SIGNATURE 


PR00186A 13.62 2.800e- 
oy o /— lui 


1129 


BL00170 


Cyclophilin-type
peptidyl-prolyl cis-
trans isomerase
signatur.


BL00170C 18.49 3.077e- 

J3 o* — l^y xSJjUUX/Ut* 

20.97 6.838e-25 37-77 
BL00170A 17.08 3.455e-
15 10-37 


1131 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 5.304e- 
15 29-46 BL00636B 
15.11 1.360e-14 59-80 


1132


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 6.211e-09
29-40


1133 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 




BL00990


Clathrin adaptor
complexes medium chain 
proteins . 


qt nnoQfir n a '7B a t 7Cp. 
jaJ-tU u _7 1? \J*_ 10 . / □ •♦..i.foe — 

38 235-269 BL00990A 

21.44 4. 316e-36 94-132 

BL00990B 20.15 2.125e- 

27 157-187 BL00990D

16.13 5.320e-18 403- 

422 


1137 


PR00314


CLATHRIN COAT ASSEMBLY
PROTEIN SIGNATURE 


PR00314B 15.68 8.000e- 
34 100-128 PR00314D 
9.66 3.531e-33 233-261 
PR00314C 16.05 8.909e- 
32 159-188 PR00314A 
14.53 1.281e-22 13-34 


1139 


BL01115


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 6.364e- 
13 13-57 


1141 


BL00107 


Protein Kinases ATP- 
binding region proteins. 

■ 


BL00107A 18.39 4.00Ge- 
19 451-482 BL00107B 
13.31 3.077e-12 519- 
535 


1148 


PR00685 


TRANSCRIPTION INITIATION 
FACTOR IIB SIGNATURE


PR00685A 13.62 4.676e- 
09 21-42 


1155


PD01652 


RECEPTOR CELL NK 
GLYCOPROTEIN IMMUNOGLOB.


PD01652B 8.50 9.396e- 
10 522-574 PD01652B 
8.50 9.463e-10 740-792 


1157 


PD02894 

* 


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 


PD02894A 21.96 7.873e-
28 81-127 PD02894B
13.93 1.188e-27 178- 
211 


1159 


BL00623 


GMC oxidoreductases 
proteins . 


BL00623E 15.00 3.531e- 
20 391-414 BL00623C 
10.86 4.240e-20 155- 
176 


1161 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA-.


PD01937A 6.68 3.475e-
09 330-341 


1162 


PD01937 


DNA PROTEIN POLYMERASE 
ENDONUCLEASE DNA-.


PD01937A 6.68 3.475e-
09 221-232 


1163 


PR00624 


HISTONE H5 SIGNATURE


PR00624D 11.94 7.455e-
10 214-239 PR00624D
11.94 1.961e-09 312-
337 


1167 


BL00226 


Intermediate filaments 
proteins . 


BL00226B 23.86 7.384e- 
09 302-350 


1177 


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032G 8.33 1.422e- 
10 34-48 


1178 


PR00320 


G- PROTEIN BETA WD-40 
REPEAT SIGNATURE 


PR00320A 16.74 1.794e-
10 205-220 PR00320C
13.01 7.840e-10 205-
220 PR00320B 12.19
8.457e-10 35-50
PR00320A 16.74 7.146e- 
09 35-50 PR00320B 
12.19 9.100e-09 79-94 


1180 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454D 10.89 4.150e- 
19 765-784


1181 


BL00291 


Prion protein. 


BL00291A 4.49 8.962e- 
11 152-107 


1184 


BL00720


Guanine-nucleotide 
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 4.103e- 
18 1089-1113 

• 


1185 


BL00215


Mitochondrial energy
transfer proteins.


BL00215A 15.82 4.553e- 
13 204-229 BL00215A 
15.82 1.429e-12 11-36 
BL00215A 15.82 9.809e- 
11 104-129 


1187 


BL00983 


Ly-6 / u-PAR domain 
proteins . 


BL00983C 12.69 2.761e- 
10 77-93 


1188 


BL00878 


Orn/DAP/Arg 

decarboxylases family 2 
pyridoxal-P attachment 
si. 


BL00878B 10.95 6.000e-
16 189-204 BL00878C
17.74 8.435e-15 225- 
245 BL00878F 19.67 
3.625e-13 379-402 
BL00878D 16.56 1.621e- 
09 270-289 


1191 


PD02939 


PROTEIN GLUTATHIONE 
SYNTHETASE SY. 


PD02939B 10.10 2.723e- 
12 203-220 PD02939C 
20.01 1.000e-11 224-
252 


1193 


PR00345


STATHMIN FAMILY
SIGNATURE


PR00345B 7.12 2.800e-
28 72-101 PR00345B 



WO 01/53312 



PCT/US00/34263 



8.54 7.652e-28 149-174
PR00345C 4.54 9.100e-
28 101-125 PR00345D
10.97 1.964e-24 125-
149 PR00345A 13.46
5.645e-16 43-62


1194 


PR00345 


STATHMIN FAMILY 

SIGNATURE


PR00345B 7.12 2.800e- 

8.54 7.652e-28 185-210 

28 137-161 PR00345D 
10.97 1.964e-24 161-

5.645e-16 79-98 








rt uuy?DD i. / . j / j. . j.^i/e- 
13 224-264 


1196 


BL00982


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 6.738e-
11 15-47 


1197 


BL01298 


Dihydrodipicolinate
reductase proteins.


BL01298A 13.90 5.959e-
09 51-73 


1203 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 1.000e-
14 152-190 


1204 


PR00118 


BETA-LACTAMASE CLASS A
SIGNATURE 


PR00118F 16.42 9.386e- 
09 213-229 


1206 


BL01183


ubiE/COQ5
methyltransferase family
proteins.


BL01183B 21.31 1.429e- 
37 184-229 BL01183D 
27.71 8.535e-27 264- 
307 BL01183A 13.25 
3.250e-23 51-73 
BL01183C 10.77 5.295e- 
09 246-258 


1208 


BL00979 


G-protein coupled
receptors family 3 
proteins . 


BL00979L 20.63 2.485e-
09 105-146


1209 


PF00023


Ank repeat proteins.


PF00023A 16.03 4.857e-
11 49-65 PF00023B
14.20 1.818e-09 45-55


1212 


PR00048


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e- 
14 227-241 PR00048A 
10.52 4.316e-11 199-
213 


1213 


PR00450


RECOVERIN FAMILY 
SIGNATURE 


PR00450C 12.22 1.720e- 
10 20-42 PR00450C 
12.22 3.506e-09 56-78 
PR00450D 16.58 6.769e- 
09 44-64 


1216 


BL00412 


Neuromodulin (GAP-43)
proteins.


BL00412D 16.54 5.598e-
10 179-230


1219 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 £>.348e- 
11 249-264 


1222 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.231e- 
15 295-308 PD00066 
13.92 7.231e-15 406-
419 PD00066 13.92 
2.286e-12 378-391 
PD00066 13.92 7.857e- 
12 434-447 PD00066

13.92 3.348e-ll 350- 
363 


1223 


BL50058


G-protein gamma subunit
profile.


BL50058 27.23 1.000e-
40 13-61


1226 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 8.439e- 
09 279-330 


1227 


BL00437 


Catalase proximal heme- 
ligand proteins. 


BL00437A 18.82 1.000e-
40 49-101 BL00437B 
16.28 1.000e-40 114- 
168 BL00437C 21.86
1.000e-40 190-239 
BL00437D 25.72 1.000e-
40 248-301 BL00437E 
23.95 1.000e-40 327- 
379 


1230 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 8.297e- 
10 6-60 


1231 


PR00735


GLYCOSYL HYDROLASE
FAMILY 8 SIGNATURE 


PR00735A 11.19 6.857e- 
09 391-405 


1232 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE


PR00497A 6.92 5.553e-
10 158-176 


1233 


PR00497 


NEUTROPHIL CYTOSOL
FACTOR P40 SIGNATURE 


PR00497A 6.92 5.553e- 
10 158-176 


1235 


BL00866 


Carbamoyl-phosphate
synthase subdomain
proteins.


BL00866B 36.29 2.776e- 
09 75-121 


1237 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 1.818e- 
21 36-79 


1243 


PR00403 


WW DOMAIN SIGNATURE 


PR00403B 12.19 1.104e- 
11 10-25 


1246 


PD01168 


SYNTHETASE LIGASE 
PROTEIN ALANYL. 


PD01168L 9.47 2.837e-
10 31-46 PD01168L 
9.47 4.490e-10 174-189 
PD01168L 9.47 7.612e- 
10 183-198 


1249 


BL00018 


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 2.800e-10 
183-196 


1254 


BL00183


Ubiquitin-conjugating
enzymes proteins.


BL00183 28.97 2.440e- 
36 96-144 


1255 


BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 5.670e-
11 8-52 


1256 


BL00373 


Phosphoribosylglycinamide
formyltransferase
proteins.


BL00373C 10.35 3.348e- 
12 143-156 


1258 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011B 13.08 3.217e- 
10 174-193 


1259 


BL00518 


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 8.286e- 
10 31-40 


1261 


PR00070 


DIHYDROFOLATE REDUCTASE 
SIGNATURE 


PR00070D 11.63 1.000e-
15 112-127 PR00070C
13.09 9.500e-15 51-63
PR00070A 12.92 5.500e- 
12 16-27 


1262 


BL00462 


Gamma-

glutamyltranspeptidase
proteins. 


BL00462A 20.89 6.438e- 
24 140-183 BL00462B 
17.88 5.500e-20 230- 
267 BL00462C 27.41
2.023e-11 292-347


1263 


BL00038 


Myc-type, 'helix-loop-
helix' dimerization
domain proteins.


BL00038B 16.97 9.455e-
11 62-83 


1264 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.670e- 
11 17-61


1266 


PR00837 


ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 


PR00837C 17.21 2.714e-

18 165-182 PR00837A 

14.77 4.512e-12 86-105 

PR00837D 11.12 7.577e- 
-> mi .lie 


1269 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 9.308e- 
22 40-63 PR00449E 
13.50 1.000e-16 137- 
160 PR00449D 10.79 
3.520e-11 102-116


1270 


BL00276 


Channel forming colicins 
proteins. 


BL00276A 8.87 l.SOOe- 
09 17-29 


1275 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO. 


PD02327C 15.47 9.769e- 
09 228-243 


1276 


PR00412


EPOXIDE HYDROLASE
SIGNATURE


PR00412B 12.59 7.894e-
12 119-135 PR00412C
11.30 1.857e-11 165-
179 PR00412A 13.23
3.400e-11 100-119


1277 


PF00756 


Putative esterase. 


PF00756C 14.12 9.538e-
10 127-157 


1279 


BL00134 


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.325e- 
13 128-145 


1280


BL01220 


Phosphatidylethanolamine 
-binding protein family 
proteins . 


BL01220C 14.75 9.348e- 
15 248-276 


1285 


BL00518 


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 2.286e-
10 33-42 


1287 


PF00791 


Domain present in ZO-1
and Unc5-like netrin
receptors.


PF00791B 28.49 7.182e- 
11 288-343 


1292 


PR00802 


SERUM ALBUMIN FAMILY 
SIGNATURE 


PR00802B 16.51 1.610e- 
10 81-105 


1297 


PR00716


M-PHASE INDUCER
PHOSPHATASE SIGNATURE 


PR00716C 17.65 5.696e- 
09 23-44 


1298 


BL00478 


LIM domain proteins.


BL00478B 14.79 6.478e- 


1301 


BL00127 


Pancreatic ribonuclease

family proteins.


BL00127C 31.49 3.571e-
26.57 8.800e-28 23-68 




xrxtVJ UD J / 


I X IrC* -J DUrLDLJ AIM KTiUTi t 1 UK. 

SIGNATURE 


f rCU Ubj / £> ± J. . Z. f Tt . iJuC 

09 290-306 


1307 


BL00215 


Mitochondrial energy 


BL00215A 15.82 S.SOOe- 

15.82 1.000e-16 226- 
251 BL00215A 15.82
2.658e-13 107-132 


1308


PR00898


SIGNATURE 


PR00898H 11 34 4 f»B2e- 
09 552-572 


1309 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI.


PD00301B 5.49 2.731e- 
09 390-401 


1310 


BL00983 


Ly-6 / u-PAR domain 
proteins. 


BL00983C 12.69 9.654e- 
13 73-89 BL00983B 
8.19 3.132e-09 12-22 


1313 


BL00194 


Thioredoxin family 


BL00194 12.16 1.900e- 
11 15-28 


1314 


BL00594 


Aromatic amino acids 
permeases proteins.


BL00594A 16.75 8.969e- 
10 53-97 


1316 


BL00134 


Serine proteases, 
trypsin family, 
histidine proteins.


BL00134A 11.96 9.325e- 
13 128-145 


1320 


BL00783 


Ribosomal protein L13 
proteins.


BL00783C 22.43 6.559e-
24 07-117 BL00783A
14.55 1.600e-19 8-33 
BL00783B 12.76 3.500e- 
12 74-86 


1327 


PF00514 


Armadillo/beta-catenin-
like repeat proteins. 


PF00514A 31.30 7.268e- 
11 82-120 


1329 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 6.294e- 
11 129-148 BL00030B 
7.03 4.789e-09 168-178 


1331 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE 


PR00497A 6.92 7.239e-
09 25-43 


1332 


PR00161 


NICKEL-DEPENDENT
HYDROGENASE/B-TYPE
CYTOCHROME SIGNATURE 


PR00161C 9.51 4.930e- 
09 317-337 


1333 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 6.769e- 
33 10-49 


1336 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 2.200e- 
09 262-281 


1337 


PR00700 


PROTEIN TYROSINE
PHOSPHATASE SIGNATURE


PR00700D 12.47 2.200e-
09 211-230


1340 


PR00860 


VERTEBRATE 
METALLOTHIONEIN
SIGNATURE 


PR00860A 5.46 S.034e- 
13 5-18 


1341 


BL00893


mutT domain proteins.


BL00893 18.99 6.750e-
16 46-71 


1343 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 S-974e- 
21 383-422 


1344 


DM00099 


4 kw A55R REDUCTASE
TERMINAL
DIHYDROPTERIDINE.


DM00099B 14.73 8.313e-
09 417-427 


1345 


BL00923 


Aspartate and glutamate 
racemases proteins . 


BL00923B 11.41 5.935e- 
10 135-146 


1348 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 7.231e-
13 44-57


1350 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 3.571e- 
32 416-445 PR00193C 
12.60 6.318e-31 179- 
207 PR00193B 11.69 
3.571e-24 133-159 
PR00193E 19.47 9.069e- 
22 470-499 PR00193A 
15.41 1.783e-20 77-97 


1352 


PR00447 


NATURAL RESISTANCE- 
ASSOCIATED MACROPHAGE 
PROTEIN SIGNATURE 


PR00447E 9.73 1.554e- 
15 299-319 PR00447D 
13.54 3.408e-15 200- 
224 PR00447A 12.73 
6.357e-11 97-124
PR00447G 6.69 9.877e-
10 353-373 


1353 


BL00303 


S-100/ICaBP type calcium
binding protein.


BL00303A 21.77 6.667e- 
26 45-82 BL00303B 
26.15 1.000e-24 93-130 


1355 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 5.950e- 
29 375-421 BL00039A 
18.44 7.136e-29 99-138 
BL00039C 15.63 4.000e- 
18 225-249 BL00039B 
19.19 3.182e-14 141- 
167 


1357 


PF00615


Regulator of G protein 
signalling domain 
proteins . 


PF00615B 16.25 2.216e- 
12 84-101 PF00615C 
10.06 8.412e-12 162- 
176 


1360 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.234e- 
29 10-49 


1361 


PR00925 


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925A 5.47 5.091e- 
18 14-29 PR00925B 
3.73 6.143e-14 29-42
PR00925C 5.57 4.789e- 
12 53-64 PR00925D 
6.56 1.857e-10 76-87 


1362 


BL01272 

( 


Glucokinase regulatory
protein family proteins.


BL01272B 19.61 6.870e-
30 136-171 BL01272C
11.68 3.314e-25 249-
274 BL01272A 6.49

JL . A J JL.C J. O _7 — * — X X / 


1363 


BL01272 


Glucokinase regulatory
protein family proteins.


BL01272B 19.61 6.870e-
30 113-148 BL01272C
11.68 3.314e-25 226-
251 BL01272A 6.49
1.231e-18 76-94


1364 


DM00179 


w KINASE ALPHA ADHESION
T-CELL. 


DM00179 13.97 5.304e- 
09 167-177 


1368 


PR00169 


POTASSIUM CHANNEL 
SIGNATURE 


PR00169A 16.77 1.592e- 
09 76-96 


1370 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 1.794e-
10 1-19


1371 


BL00242 


proteins . 


09 469-479 


1372 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625B 13.48 7.353e- 

X j ■* o o / ri\ UUDZ 

12.84 1.391e-16 14-34


1373 


BL00434 

• 


HSF-type DNA-binding
domain proteins.


09 90-130 1 


1374 


PR00962 


LETHAL (2) GIANT LARVAE 
PROTEIN SIGNATURE 


09 505-526


1375 


PD02475 


MUCIN EPITHELIAL TUMOR-
ASSOCIATE.


10 1111-1150


1376


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


pnfn A£C iq a ^ q cni-. — — j 
rUUiu Do X 3 . H J _7 . _» / J. C i 

32 24-63 


1380


BL00194 


Thioredoxin family 


BL00194 12.16 8.333e-


1381 


DM01970 


0 kw ZK632.12 YDR313C 


DM01970B 8.60 1.458e-

XZ> 11^ J-liJD 1 


1383 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 7.600e-10
243-254


1384


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 7.600e-10
271-282


1385 


BL00303 


S-100/ICaBP type calcium 
binding protein.


BL00303B 26.15 6.203e-
10 95-132


1386 


BL01160 


Kinesin light chain 
repeat proteins . 


BL01160B 19.54 S.042e- 
09 1574-1628


1387 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 1.000e-

11 52-61


1389 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 3.600e-
30 10-49


1390 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 3.S12e- 
31 32-71 


1392 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 9.723e-

10 127-137


1393 


PR00380 


KINESIN HEAVY CHAIN
SIGNATURE


PR00380A 14.18 9.625e-
25 88-110 PR00380D
9.93 2.406e-20 304-326
PR00380B 12.64 4.414e-
16 208-226 PR00380C
13.18 6.538e-16 243-
262


1394 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e-
14 462-475 PD00066
13.92 8.800e-14 348-
361 PD00066 13.92
9.571e-12 405-418

11 490-503 PD00066

333


1398


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 6.786e-
32 10-49


1400 


DM01206 


CORONAVIRUS NUCLEOCAPSID 

PROTEIN.


DM01206B 10.69 7.038e- 
us aim - ^yu I 


1406 


PD00930 


PROTEIN GTPASE DOMAIN
ACTIVATION.


PD00930A 25.62 7.324e-
15 363-389


1407 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 7.500e-
10 457-476


1408 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 9.550e-
11 179-193 PR00019A
11.19 8.826e-10 228-
242 PR00019B 11.36
1.360e-09 199-213
PR00019B 11.36 4.960e-
09 176-190


1409 


PR00510 


NEBULIN SIGNATURE 


PR00510A 9.09 4.150e- 
12 182-202 PR00510B 
12.96 8.767e-12 210- 
230 PR00510F 9.88 
8.172e-10 58-75 
PR00510D 9.21 2.367e- 
09 251-267 


1410 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.696e-
09 31-44 


1412 


BL00358 


Ribosomal protein L5
proteins.


BL00358B 22.76 1.000e-
40 57-103 BL00358C
13.75 6.087e-14 122-
136 BL00358D 14.26
5.500e-13 143-158
BL00358A 13.06 1.931e-
11 33-44 


1414 


BL00282 


Kazal serine protease 
inhibitors family 
proteins . 


BL00282 16.88 7.338e- 
10 511-534 


1415 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins . 


BL00023 24.31 4.300e- 
29 40-77 


1417 


PR00681


RIBOSOMAL PROTEIN S1
SIGNATURE 


PR00681G 12.54 2.149e- 
09 38-60 


1418 


DM00973


3 kw RESISTANCE BENOMYL
YLL028W CYCLOHEXIMIDE.


DM00973A 21.17 1.462e- 
09 171-208 


1419 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 1.571e- 
09 428-443 


1420 


PD01941 


TRANSMEMBRANE 
COTRANSPORTER SYMP.


PD01941A 14.81 1.000e-
40 142-196 PD01941B
15.02 7.049e-30 400-
447 PD01941E 15.92
2.475e-20 817-864
PD01941C 19.96 3.118e-
19 488-543 PD01941D 
27.18 9.614e-18 641- 
690 PD01941F 28.52 
S.382e-15 1038-1093 


1422 


PR00205


CADHERIN SIGNATURE 


PR00205B 11.39 8.043e- 
12 199-217 


1423 


PR00209 


ALPHA/BETA GLIADIN 
FAMILY SIGNATURE 


PR00209B 4.88 6.318e- 
11 1009-1028 


1424 


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002A 14.19 8.200e- 
14 367-386 BL50002A
14.19 9.250e-12 298-
317 BL50002A 14.19
4.462e-11 208-227
BL50002B 15.18 1.000e-
09 244-258 


1425 


PF00628


PHD-finger.


PF00628 15.84 3.045e- 
12 330-345 


1426 


PF00628 


PHD-finger.


PF00628 15.84 3.045e- 
12 377-392 


1427 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.114e- 
16 281-299 PR00405A 
17.71 4.306e-14 262- 
282 


1428 


BL00039


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 5.219e- 
34 147-193 


1429 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 8.920e- 
10 577-592 


1430 


PR00378


INOSITOL PHOSPHATASE 
SIGNATURE 


PR00378D 16.86 7.563e- 
12 295-314 PR00378B 
13.80 8.650e-10 166- 
186 


1431 


PR00928


GRAVES DISEASE CARRIER
PROTEIN SIGNATURE


PR00928B 13.53 3.769e-
10 103-124


1433 


BL01113


C1q domain proteins.


D JLl U J. J. JL O O -L O . O / • U 4 !

15 14-50 BL01113C

13.18 7.000e-12 82-102


1434 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 7.983e-
10 135-150 


1436 


BL00030


Eukaryotic RNA-binding

region RNP-1 proteins.


BL00030A 14.39 1.000e-
12 84-103 


1438 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 2.500e-
09 250-268 BL00290A 
20.89 4.000e-09 188- 
211 


1440 


PR00806


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e- 
09 38-52 


1441 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e- 
09 88-102 


1444 


BL00422 


Granins proteins. 


BL00422D 19.48 1.000e-
08 114-138 


1445 


PD01841 


PHOSPHORYLASE KINASE
ALPHA MUSCL.


PD01841A 21.71 1.000e-
40 73-123 PD01841B
14.35 1.000e-40 144-
185 PD01841D 17.87
1.000e-40 206-258
PD01841F 13.36 1.000e-
40 296-345 PD01841G
24.26 1.000e-40 349-
403 PD01841I 23.00
1.000e-40 494-536
PD01841J 14.94 1.000e-
40 895-932 PD01841L
18.42 1.000e-40 1083-
1125 PD01841E 18.60
9.719e-38 258-296
PD01841K 14.81 1.000e-
35 1041-1071 PD01841H
21.30 3.189e-31 435-
472 PD01841C 13.78

PD01841M 10.82 1.250e- 
oft inc.n Qa 


1446


PF00816


H-NS histone family.


PF00816B 13.84 8.875e- 

Vy 127 1/ 


1447 


PR00048 


C2H2-TYPE ZINC FINGER 

SIGNATURE


PR00048A 10.52 2.080e- 


1448 


DM00315 


072 RIBONUCLEASE 
INHIBITOR. 


DM00315D 18.40 7.393e- 
09 23-67 


1451 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030B 7.03 2.800e- 
10 94-104 


1454 


DM01688 




nMfi l ftfin 13 44 7 14 6a- 
09 382-405 


1455 


PF00777 


Sialyltransferase
family. 


DT^^^fl77■7C , 1 H 60 2 929e- 
22 4-59 


1457 


BL00927 


Trehalase proteins. 


BL00927C 10.83 8.085e- 
\j j ** a — 


1460 


BL00545 


Aldose 1-epimerase 
proteins . 


BL00545C 11.28 7.353e- 

i / lb" lo« DJJ>J U -J * — ' »» 

10.20 2.071e-15 73-89 
BL00545B 13.10 3.942e- 
09 140-153 


1466 


PR00097 


ANTHRANILATE SYNTHASE
COMPONENT II SIGNATURE 


PR00097C 9.42 9.069e- 
09 233-245 


1472 


BL01129 


Hypothetical 
yabO/yceC/sfhB family 
proteins. 


BL01129E 13.25 5.250e- 
22 170-195 BL01129C 
25.56 9.526e-18 63-106 


1473 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e-
09 2114-2145 


1475 


PF00686 


Starch binding domain 
proteins. 


PF00686A 13.45 9.100e-
09 267-277


1477 


PF00566 


Probable rabGAP domain
proteins.


PF00566A 12.64 7.333e-
10 466-476 


1478 


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030B 7.03 9.400e-
10 43-53 


1479 


DM00406 


GLIADIN.


DM00406 7.73 8.541e-10 
292-305 


1480 


BL00290 


Immunoglobulins and
major histocompatibility
complex proteins.


BL00290B 13.17 2.385e-
15 69-87 BL00290A
20.89 5.091e-11 12-35


1481 


PR00150 


PHOSPHOENOLPYRUVATE
CARBOXYLASE SIGNATURE


PR00150F 10.45 9.039e-
09 21-51 


1482 


PF00780 


Domain found in NIK1-
like kinases, mouse 
citron and yeast ROM. 


PF00780I 14.69 4.825e- 
09 107-137 


1483 


BL01160 


Kinesin light chain 
repeat proteins . 


BL01160B 19.54 1.153e-
09 108-162 


1485 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 5.909e- 
25 17-56 


1486


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107B 13.31 1.529e- 
09 34-50 


1488 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 9.586e-
10 116-162 


1490 


BL00166 


Enoyl-CoA 

hydratase/isomerase 
proteins . 


BL00166D 22.87 2.607e- 
24 190-226 BL00166C
18.93 5.500e-14 140-
167 BL00166B 16.92
9.357e-11 93-115


1491 


BL00452 


Guanylate cyclases 
proteins . 


BL00452D 28.59 3.700e- 
31 63-106 BL00452E
11.92 3.045e-13 115- 
131 


1492 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 3.667e- 
09 532-546 


1497 


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107B 13.31 1.000e-
11 384-400 BL00107A
18.39 5.345e-11 322-
353 


1500 


PF00876 


Ogre family. 


PF00876E 7.99 1.947e- 
10 107-117 


1502 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 4.789e- 
24 112-155 


1503 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 4.789e- 
24 112-155 


1505 


BL01177 

• 


Anaphylatoxin domain
proteins . 


BL01177E 20.64 5.800e- 
24 448-475 BL01177C 
17.39 5.333e-19 402- 
421 BL01177B 13.61 
7.840e-16 155-171 
BL01177D 17.50 1.900e- 
15 427-445 


1506 


BL00972


Ubiquitin carboxyl-
terminal hydrolases
family 2 proteins.


BL00972D 22.55 5.500e-
14 311-336 BL00972A 
11.93 7.429e-14 48-66 
BL00972B 20.72 8.759e- 
10 341-363 


1512 


BL00523


Sulfatases proteins.


BL00523E 19.27 4.536e- 
22 76-106 BL00523D 
9.89 1.563e-ll 40-52 
BL00523F 10.85 4.162e- 
09 159-170 BL00523G 
9.46 5.333e-09 256-266 


1516 


BL00914 


Syntaxin / epimorphin 
family proteins. 


BL00914 24.91 7.045e- 
14 168-218 


1518 


BL00600 


Aminotransferases class-
III pyridoxal-phosphate
attachment si.


BL00600A 17.98 6.143e- 
19 98-122 BL00600E 
16.43 1.771e-17 302-
331 BL00600G 12.43
9.62Se-17 377-396
BL00600B 15.60 5.091e-
15 160-186 BL00600C
16.18 6.040e-12 190-
206 BL00600F 8.77
1.000e-11 343-356
BL00600D 8.71 1.000e-
10 281-295



1523 



PD00930 



PROTEIN GTPASE DOMAIN 
ACTIVATION . 



1528 



PR00320 



G-PROTEIN BETA WD-40
REPEAT SIGNATURE



PR00320B 12.19 4.774e-
11 192-207 PR00320B
12.19 8.839e-11 272-
287 PR00320B 12.19
9.743e-10 106-121
PR00320A 16.74 1.878e-
09 192-207 PR00320A
16.74 2.317e-09 106-
121 PR00320A 16.74
8.683e-09 272-287
PR00320C 13.01 8.800e-
09 106-121



1538



DM01970 



0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 



1539 



PF00781 



Diacylglycerol kinase
catalytic domain
proteins (presumed)



PF00781D 11.11 7.593e-
10 103-127 



1540 



PR00965 



OCULAR ALBINISM TYPE 1 
PROTEIN SIGNATURE 



BL01013 
PD02699 



Oxysterol-binding
protein family proteins
PROTEIN DNA-BINDING
BINDING DNA.



1544 



PR00049 



WILM'S TUMOUR PROTEIN 
SIGNATURE 



PR00965H 10.73 1.231e- 
29 312-334 PR00965E 
12.93 5.846e-29 172- 
195 PR00965F 5.98 
1.123e-28 209-231 
PR00965C 15.04 1.000e-
27 131-151 PR00965D
5.84 1.000e-27 150-170
PR00965G 8.52 2.440e-
27 258-279 PR00965B
4.80 8.650e-26 88-109
PR00965A 12.52 1.000e-
25 35-55 PR00965I
3.91 6.442e-25 385-406
BL01013D 26.81 9.719e-
17 163-207
PD02699C 24.84 1.000e-
40 599-646 PD02699A 
8.91 2.286e-34 219-248 
PD02699B 18.28 6.143e- 
21 485-509 



PR00049D 0.00 7.857e- 
10 182-197 PR00049D 
0.00 7.102e-09 67-82 



1547 



BL00951 



ER lumen protein 
retaining receptor 
proteins . 



BL00951C 19.35 ; 
40 93-142 BL00951D 
13.94 8.714e-40 142- 
177 BL00951A 15.10
1.000e-38 2-38
BL00951B 14.23 6.250e-
33 38-69 



1548 



BL00536 



Ubiquitin-activating
enzyme proteins.



BL00536F 13.65 8.920e-
30 279-318 BL00536D
22.91 5.737e-24 21-65
BL00536E 16.94 4.696e-
18 248-279 



1549 



PR00139



ASPARAGINASE/GLUTAMINASE
FAMILY SIGNATURE



PR00139C 11.72 9.679e-
09 550-569 



1553 



PR00049 



WILM'S TUMOUR PROTEIN
SIGNATURE


1556 


BL00061 


Short-chain

dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.276e- 
13 67-105 


1557 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 8.105e-
12 107-132 


1558 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 8.105e- 
12 107-132 


1559 


BL01228 


Hypothetical cof family
proteins.


BL01228D 17.44 8.105e-
12 107-132 


1562 


BL00522 


DNA polymerase family X 
proteins . 

• 


BL00522C 11.90 6.600e- 
18 412-436 BL00522B 
27.30 1.738e-16 364- 
410 BL00522A 25.52 
6.000e-16 279-326
BL00522E 19.63 6.123e-
14 502-532 BL00522F 
14.90 2.385e-13 551- 
575 


1563 


PF00651


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 1.947e-
11 46-59 


1564 


BL00299 


Ubiquitin domain 
proteins . 


BL00299 28.84 2.823e- 
10 324-376 


1566 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 8 . S94e- 
17 184-228 BL01013C 
9.97 4.906e-12 14-24


1567 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins.


BL00678 9.67 3.400e-10
378-389 BL00678 9.67
5.800e-10 418-429
BL00678 9.67 8.800e-10 
295-306 


1570 


BL00479 


Phorbol esters /
diacylglycerol binding 
domain proteins . 


BL00479B 12.57 5.235e- 
17 297-313 BL00479A 
19.86 6.625e-15 271- 
294 BL00479A 19.86 
2.667e-14 147-170
BL00479B 12.57 6.294e- 
12 173-189 


1576 


PR00665 


OXYTOCIN RECEPTOR 
SIGNATURE 

• 


PR00665G 12.36 4.673e- 
24 364-384 PR00665D 
9.93 1.200e-22 138-155 
PR00665F 11.73 4.000e- 
22 337-354 PR00665C 
5.89 1.000e-20 65-80 
PR00665B 5.29 4.337e- 
19 24-39 PR00665E 
5.60 2.929e-15 246-260
PR00665A 5.99 5.622e-
15 11-25 


1577 


DM00099


4 kw A55R REDUCTASE 
TERMINAL 

DIHYDROPTERIDINE . 


DM00099B 14.73 9.308e- 
10 127-137 


1579 


BL00524


Somatomedin B domain 
proteins . 


BL00524A 9.65 6.776e- 
14 52-73


1580 


PD02894 

• 


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 


PD02894B 13.93 6.959e- 
16 182-215 PD02894A 
21.96 2.125e-10 57-103 


1581 


BL00411 


Kinesin motor domain 
proteins . 


BL00411C 15.04 5.292e- 

15.66 4.441e-11 245-
276 


1582 


PR00604 


CLASS IA AND IB 
CYTOCHROME C SIGNATURE 


PR00604A 11.13 2.440e- 
09 79-87 


1584 


PF00651


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 1.000e-
10 225-238 


1585 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551C 14.62 9.455e-
11 125-145 


1586 


DM01354 


kw TRANSCRIPTASE REVERSE 
II ORF2. 


DM01354S 11.61 7.750e-
09 474-495


1587



PR00072



MALIC ENZYME SIGNATURE



PR00072B 13.77 7.9S5e-
33 180-210 PR00072A 
12.75 6.040e-25 120- 
145 PR00072C 11.42 
2.286e-24 216-239 
PR00072D 10.77 3.400e- 
22 276-295 PR00072E 
10.54 1.360e-19 301-
318 PR00072G 10.45
5.304e-19 433-450
PR00072F 8.87 5.935e-
15 332-349 



1589 



BL00191 



1590 



DM01970 



Cytochrome b5 family, 
heme -binding domain 
proteins . 



BL00191H 15.64 1.537e- 
22 51-113 BL00191K 
17.38 9.027e-12 398- 
442 



0 kw ZK632.12 YDR313C 
ENDOSOMAL III . 



DM01970B 8.60 7.716e- 
13 211-224 DM01970B 
8.60 2.157e-12 94-107



1591 



DM00517 



5 kw NUCLEAR 60.7 NUP1 
CHROMOSOME . 



DM00517B 10.96 6.625e- 
16 117S-1193 DM00517A 
8.21 1.000e-11 1015-
1026 



1592 



BL00037 



1595 



BL00028 



Myb DNA-binding domain
proteins repeat proteins 
proteins . 



Zinc finger, C2H2 type, 
domain proteins . 



BL00037B 15.92 3.250e- 
27 116-142 BL00037A 
16.68 2.500e-24 83-107 
BL00037A 16.68 3.250e- 
12 31-55 BL00037B 
15.92 3.526e-11 64-90
BL00037C 16.86 9.654e- 
10 146-164 



BL00028 16.07 1.514e-
09 110-127 



1598 



PF00628 



PHD-finger.



PF00628 15.84 3.250e- 
11 1667-1682 



1599 



PR00014



FIBRONECTIN TYPE III 
REPEAT SIGNATURE 



PR00014D 12.04 5.500e-
09 980-995 



1600 



BL00518 



1602 



BL00412 



Zinc finger, C3HC4 type 
(RING finger) , proteins 



BL00518 12.23 6.571e- 
10 30-39 



Neuromodulin (GAP-43) 
proteins. 



BL00412D 16.54 5.402e-
10 136-187 



1605 



PF00651 



1607 



BL00252 



1610 



DM00215 



1611 



BL00904 



1612 



PF00168



1613 



BL00412 



BTB (also known as BR- 
C/Ttk) domain proteins 



PF00651 15.00 3.571e-
10 44-57 



Interferon alpha, beta 
and delta family 
proteins.



BL00252A 18.49 6.657e- 
23 20-57 BL00252B 
19.78 9.125e-16 58-109 



PROLINE-RICH PROTEIN 3 



DM00215 19.43 1.000e-
08 61-94 



Protein 

prenyltransferases alpha
subunit repeat proteins 
proteins . 



BL00904C 8.98 7.3S3e- 
10 91-125 BL00904D 
1.47 6.018e-09 127-168 



C2 domain proteins. 



PF00168C 27.49 3.250e-
09 365-391 



Neuromodulin (GAP-43) 
proteins.



BL00412D 16.54 6.051e-
09 932-983 BL00412D 
16.54 7.153e-09 933- 
984 



1614 



BL00559 



1615 



PD01427 



Eukaryotic molybdopterin 
oxidoreductases 
proteins . 



BL00559I 13.63 3.531e- 
25 54-83 BL00559K 
13.17 2.957e-18 197- 
224 BL00559J 19.63
6.870e-16 124-176 
BL00559L 13.60 9.000e- 
16 266-284 



TRANSFERASE 
METHYLTRANS FERASE Bl 



PD01427B 22.45 3.025e-
22 500-541 PD01427A 
19.94 8.773e-18 439-
472 


1616 


BL00115 


Eukaryotic RNA
polymerase II
heptapeptide repeat
proteins.


BL00115Z 3.12 7.485e-
09 152-201 BL00115Z
3.12 9.603e-09 145-194 


1617 


BL00303 


S-100/ICaBP type calcium
binding protein.


BL00303B 26.15 7.750e-
32 51-88 BL00303A 
21.77 1.947e-31 4-41 


1618 


BL01254


Fetuin family proteins.


BL01254F 10.02 8.754e-
09 137-147 


1619 


PD01888 


PEPTIDE REDUCTASE 
PROTEIN METHI. 

1 


PD01888B 25.10 1.000e-
40 47-97 PD01888C
21.56 7.000e-30 125-
155 PD01888A 12.84
8.800e-15 7-23


1621 


PR00239 


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE


PR00239E 1.58 3.455e- 
09 692-704 PR00239E 
1.58 4.580e-09 697-709 
PR00239E 1.58 4.580e- 
09 702-714 PR00239E
1.58 5.193e-09 703-715 


1622 


PR00860 


VERTEBRATE 

METALLOTHIONEIN 

SIGNATURE 


PR00860B 7.04 1.900e- 
18 27-41 PR00860C 
9.61 1.474e-14 41-51 
PR00860A 5.46 1.720e- 
14 5-18 


1624 


PR00784


MITOCHONDRIAL BROWN FAT
UNCOUPLING PROTEIN
SIGNATURE


PR00784D 15.86 8.027e-
11 77-95 


1626 


BL00325 


Actin-depolymerizing 
proteins . 


BL00325B 21.66 1.000e-
40 93-139 BL00325A 
24.83 6.786e-23 61-93 


1631 


BL00064


L-lactate dehydrogenase
proteins.


BL00064B 23.57 1.000e-
40 82-130 BL00064C
17.28 1.000e-40 137-
182 BL00064E 27.20
1.000e-40 223-275
BL00064F 25.14 7.882e- 
36 286-331 BL00064A 
21.16 1.000e-33 22-60 
BL00064D 14.19 6.500e- 
31 182-212 


1632 


PR00063


RIBOSOMAL PROTEIN L27
SIGNATURE


PR00063B 15.24 9.700e-
11 59-84 PR00063A 
11.71 1.614e-09 34-59 


1634 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239D 0.00 1.105e- 
11 36-49 PR00239C 
3.51 2.538e-09 37-45 


1636 


BL01210


Caveolins proteins.


BL01210B 13.92 9.531e-
10 133-183 


1637 


BL00982 


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 5.388e- 
11 11-43 


1639 


BL01183 


ubiE/COQ5 

methyltransferase family
proteins. 


BL01183B 21.31 8.144e- 
12 132-177 


1640 


PR00015 


GRAM-POSITIVE COCCUS
SURFACE PROTEIN ANCHOR 
SIGNATURE 


PR00015B 9.84 8.468e- 
10 128-149 


1641 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320B 12.19 5.93Se-
11 364-379 PR00320A
16.74 7.828e-11 364-
379 PR00320C 13.01 
2.800e-10 279-294 
PR00320C 13.01 2.800e- 
10 364-379 PR00320B 
12.19 5.114e-10 279- 
294 PR00320A 16.74 
1.659e-09 279-294
PR00320A 16.74 2.098e- 
09 229-244 


1642 


PF00023 


Ank repeat proteins . 


PF00023A 16.03 6.464e-
09 114-130 


1643 


PR00169 


POTASSIUM CHANNEL 
SIGNATURE 


PR00169A 16.77 1.806e- 
11 74-94 


1644 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins.


BL00678 9.67 2.200e-10 
109-120 BL00678 9.67 
5.737e-09 528-539 


1645 


BL01108


Ribosomal protein L24
proteins . 


BL01108A 20.33 7.366e- 
17 56-89 


1646


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.270e- 
21 103-125 PR00380D 
9.93 6.308e-18 386-408
PR00380C 13.18 7.923e-
16 332-351 PR00380B 

310 


1647 


DM01242 


3 THREONINE — TRNA 
LIGASE . 


37 340-381 DM01242E 

•>1 A A C A "71 «*_ ^ 1 4fi"J_ 
ZJ.UU D . U / J-C — J X H \j j — 

505 DM01242D 23.29 
3.925e-30 420-463 j 
DM01242B 23.57 8 . 054e- 
10 "Jfi^-^^ DMDX242P 

10.61 7.618e-14 526- 


1649 


PD00126 


PROTEIN REPEAT DOMAIN 


PD00126A 22.53 5.500e- 


1651 


BL01160


Kinesin light chain
repeat proteins.


BL01160B 19.54 6.720e- 


1652 


BL00933 


FGGY family of 
carbohydrate kinases 
proteins . 

% 


BL00933A 17.50 4.673e- 
12 11-35 BL00933E 

472 


1653 


BL00795


Involucrin proteins.


nT.AmQ^r' "i 7 nfi 2 98Re*- 
10 70-115 


1654 


BL00982


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 7.750e-
17 302-334 


1655 


BL00982


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 7.750e-
17 282-314 


1656 


BL00741 


Guanine-nucleotide
dissociation stimulators
CDC24 family sign.


BL00741B 14.27 1.391e-
16 607-630 


1657 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 

» * 


PR00449A 13.20 7.938e-
11 114-136 


1658 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE 


PR00910A 2.51 8.889e- 
10 442-455 






terminal hydrolases 
family 2 proteins.


BL00972D 22.55 4.140e-
12 376-401 BL00972E 
20.72 5.629e-09 446- 
468 


1660 


BL00406 


Actins proteins. 


BL00406D 12.58 8.767e- 
15 188-243 


1661 


PR00105 


CYTOSINE-SPECIFIC DNA
METHYLTRANSFERASE
SIGNATURE 


PR00105A 10.36 4.900e- 
13 1140-1157 PR00105B 
12.32 2.800e-12 1259- 
1274 PR00105C 10.86 
1.000e-10 1305-1319 


1662 


BL00280


Pancreatic trypsin
inhibitor (Kunitz)
family proteins.


BL00280 24.61 3.172e- 
33 3119-3163 


1663 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e- 
23 107-125 PR00319C 
13.41 5.714e-20 89-105 
PR00319A 15.27 5.286e- 
19 51-68 PR00319B 
11.47 8.200e-19 70-85


1664 


BL00018 


EF-hand calcium-binding
domain proteins.


BL00018 7.41 S.050e-10 
489-502 


1667 



PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 8.500e- 
38 7-46 


1669 


BL01153 


NOL1/NOP2/sun family
proteins.


BL01153D 19.69 1.188e-
17 115-141 BL01153C
13.67 8.977e-15 66-80 
BL01153B 20.52 1.885e- 
10 13-37 


1671 


PR00678 


PI3 KINASE P85 
REGULATORY SUBUNIT 
SIGNATURE 


PR00678H 9.13 3.100e-
10 1146-1169 


1672 


BL00598 


Chromo domain proteins. 


BL00598 14.45 8.500e- 
20 27-49 


1673 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE


PR00326A 8.75 8.329e-
09 686-707 


1674 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.580e-
11 343-358 PR00049D
0.00 1.286e-10 342-357 


1676 


PR00747 


GLYCOSYL HYDROLASE 
FAMILY 47 SIGNATURE 



PR00747H 12.76 8.636e-
19 427-448 PR00747G 
14.50 2.286e-18 368- 

7.500e-18 112-131
PR00747A 14.05 4.600e-
17 42-63 PR00747D
15.23 8.759e-17 163-
183 PR00747E 15.13 
8.244e-15 254-272 
PR00747B 7.65 5.355e- 
13 75-90 PR00747F 
13.56 8.714e-10 311- 

Tip 


1677 


PR00747 


GLYCOSYL HYDROLASE 
FAMILY 47 SIGNATURE 



PR00747H 12.76 8.636e- 
19 309-330 PR00747G 

_L t • J U .L. . i. D V a JL O L J u 

275 PR00747C 12.06 
7.500e-18 112-131
PR00747A 14.05 4.600e-
17 42-63 PR00747B
7.65 5.355e-13 75-90
PR00747F 13.56 8.714e-
10 193-210 


1680 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins . 


BL00678 9.67 4.600e-10 
406-417 BL00678 9.67 
6.684e-09 320-331 


1681 


BL00678


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 4.600e-10
329-340 BL00678 9.67 
6.684e-09 243-254 


1683 


PR00326 


GTP1/OBG GTP- BINDING 
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 1.346e- 
13 389-410 


1685 


PR00646


RDC1 ORPHAN RECEPTOR 
SIGNATURE 


PR00646H 6.32 4.188e-
09 755-771 


1690 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e- 
09 75-129 


1691 


PR00456 


RIBOSOMAL PROTEIN P2
SIGNATURE 


PR00456E 3.06 7.281e- 
10 418-433 PR00456E 
3.06 7.281e-10 419-434 
PR00456E 3.06 8.125e- 
10 420-435 


1692


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 7.281e-
10 487-502 PR00456E 
3.06 7.281e-10 488-503 
PR00456E 3.06 8.125e- 
10 489-504 


1693 


BL00674 


AAA-protein family
proteins.


BL00674C 22.60 8.043e- 
24 274-317 BL00674B 
SEQ ID NO:


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 








4.46 4.000e-23 241-263 
BL00674D 23.41 8.560e- 
10 Hfl.'lflQ RT.00674P 

15.24 1.720e-15 414- 
434 


1697 


PR00409 


PHTHALATE DIOXYGENASE 
REDUCTASE FAMILY 
SIGNATURE 


PR00409F 12.70 4.388e- 
10 427-447 


1696 


PR00466


CYTOCHROME B-245 HEAVY
CHAIN SIGNATURE 


PR00466C 10.17 3.443e- 
13 187-208 PR00466B 
5.03 5.500e-11 162-186

09 498-517 


1699 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 9.217e-
12 283-300 BL00028 

I /• f\ —J T "J C O c» 11 occ. 
lb . U / J./D^B-i-X <s35" 

272 BL00028 16.07 
5.154e-11 171-188

11 227-244 BL00028
16.07 1.600e-10 199-
216


1700 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 3.348e- 
15 62-102 BL01019B 
19.49 4.000e-15 107-
162 


1703 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.484e-
12 200-239 


1707 


PR00109


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.558e- 
14 134-153 


1710 


PR00019


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 2.56be- 

10 116-130 PR00019B 

11 .36 4 . oOOe-U7 JLAo — 
127 PR00019B 11.36

/ . iz Uc ~ U j Z U 1 * ^; -L o 


1711 


BL01159 


WW/rsp5/WWP domain 
proteins . 


BL01159 13.85 6.523e-

13.85 5.408e-10 613- 
628 


1712 


PF00023 


Ank repeat proteins.


PF00023A 16.03 7.000e-
10 187-203 


1713 


PF00642


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.550e-
11 230-241 


1714 


PF00642 


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 9.550e- 
11 230-241 


1715


BL01115 


GTP-binding nuclear


BL01115A 10.22 7.129e- 
09 7-51 


1718 


BL00353


HMG1/2 proteins. 


BL00353C 14.83 6.018e- 
10 136-183 BL00353B 
11.47 8.866e-09 86-136 


1719 


BL00412


proteins.


BL00412D 16.54 5.408e-
09 432-483 


1721 


BL00038


Myc-type, 'helix-loop-
helix' dimerization
domain proteins.


BL00038B 16.97 8.448e- 
t*> 7Q-1O0 BL00038A 
13.61 4.000e-11 52-68




PD00567


PROTEIN RNA-BINDING RNA
REPEAT HYD. 


PD00567C 9.17 8.500e- 
09 418-428 


1724 


BL01279 


Protein-L-
isoaspartate(D-
aspartate) O-
methyltransferase signa.


BL01279A 24.27 5.663e-
12 233-281 


1728 


BL00018 


EF-hand calcium-binding
domain proteins.


BL00018 7.41 2.059e-11
73-86 BL00018 7.41
4.176e-11 157-170


1730 


BL00594 


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 1.089e-
09 17-61 
SEQ ID NO:


ACCESSION
NO.


DESCRIPTION


RESULTS*


1731


BL01160


Kinesin light chain
repeat proteins.


BL01160B 19.54 9.676e-10 296-350


1732


BL01160


Kinesin light chain
repeat proteins.


BL01160B 19.54 9.676e-10 316-370


1733


PF00850


Histone deacetylase
family.


PF00850F 15.70 4.349e-22 246-279
PF00850D 14.76 6.850e-20 177-201
PF00850E 8.88 8.691e-18 209-235
PF00850G 22.75 4.098e-14 281-323


1734


BL00354


HMG-I and HMG-Y DNA-
binding domain proteins
(Ahook).


BL00354C 6.61 5.932e-09 292-307


1735


DM00179


KINASE ALPHA ADHESION
T-CELL.


DM00179 13.97 5.263e-10 492-502


1743


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449A 13.20 1.188e-11 5-27
PR00449D 10.79 2.241e-10 109-123
PR00449E 13.50 9.289e-10 144-157


1744


PR00449


TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449A 13.20 1.188e-11 5-27
PR00449D 10.79 2.241e-10 109-123
PR00449E 13.50 9.289e-10 144-167


1745


BL00720


Guanine-nucleotide
dissociation stimulators
CDC25 family sign.


BL00720B 16.57 8.297e-15 136-160


1746


PR00081


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY
SIGNATURE


PR00081B 10.38 6.727e-11 45-57
PR00081E 17.54 3.935e-10 150-168


1747


BL00439


Acyl transferases
ChoActase / COT / CPT
family proteins


BL00439H 18.24 8.435e-14 65-91
BL00439G 13.40 2.895e-12 3-14


1749


PR00819


CBXX/CFQX SUPERFAMILY
SIGNATURE


PR00819B 10.83 7.158e-11 4-20


1751


PD00066


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e-14 33-46
PD00066 13.92 1.000e-13 89-102
PD00066 13.92 7.000e-13 61-74
PD00066 13.92 6.571e-12 117-130


1753


BL01013


Oxysterol-binding
protein family proteins.


BL01013D 26.81 6.516e-18 33-77


1754


BL00790


Receptor tyrosine kinase
class V proteins.


BL00790I 20.01 2.393e-09 490-521
BL00790I 20.01 2.821e-09 60-91
BL00790I 20.01 6.357e-09 287-318


1756


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.750e-35 10-49


1758


DM00406


GLIADIN.


DM00406 7.73 7.600e-09 653-666


1762


PD02929


ADHESION GLYCOPROTEIN
PRECURSOR I-


PD02929A 28.27 4.529e-09 224-278


1765


PR00326


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE


PR00326A 8.75 5.950e-11 146-167


1775


PF00023


Ank repeat proteins


PF00023A 16.03 3.077e-14 523-539


1776


BL00942


glpT family of
transporters proteins.


BL00942F 15.07 4.343e-10 371-389
BL00942B 20.36 8.040e-09 94-137


1777


DM00215


PROLINE-RICH PROTEIN 3


DM00215 19.43 2.373e-09 279-312
SEQ ID NO:


ACCESSION 
NO. 


DESCRIPTION 


RESULTS* 


1778 


BL00084


Copper type II,
ascorbate-dependent
monooxygenases proteins.


BL00084D 25.11 3.700e-
20 169-224 BL00084B
24.26 8.134e-16 10-58
BL00084C 27.71 8.412e- 
11 107-158 


1779 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 3.758e- 
18 611-655 BL01013A 
25.14 2.891e-15 344- 
380 BL01013C 9.97 
6.308e-13 435-445 
BL01013B 11.33 3.717e- 
12 409-420 


1783 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 


1784 


BL00741


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 



* results include in order: accession number subtype; raw score; p-value; position of
signature in amino acid sequence. 
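
The record layout described in the footnote above can be illustrated with a short parsing sketch. This is an editorial illustration only, not part of the original disclosure: the names `SignatureHit` and `parse_results` are hypothetical, and the pattern assumes the accession style seen in this table (two letters, five digits, optional subtype letter) and p-values written in `e-` notation. It also rejoins values that wrap across lines, such as `8.138e-` followed by `13`.

```python
import re
from typing import List, NamedTuple


class SignatureHit(NamedTuple):
    """One entry of a RESULTS cell (hypothetical helper type)."""
    subtype: str      # accession number subtype, e.g. "BL00741B"
    raw_score: float  # raw score
    p_value: float    # p-value
    start: int        # first residue of the signature
    end: int          # last residue of the signature


# Assumed layout: <subtype> <raw score> <p-value> <start>-<end>
_HIT = re.compile(
    r"([A-Z]{2}\d{5}[A-Z]?)\s+"   # accession number subtype
    r"(\d+(?:\.\d+)?)\s+"         # raw score
    r"(\d+(?:\.\d+)?e-\d+)\s+"    # p-value
    r"(\d+)-(\d+)"                # position in the amino acid sequence
)


def parse_results(cell: str) -> List[SignatureHit]:
    """Split one RESULTS cell into its individual signature hits."""
    text = " ".join(cell.split())
    # Rejoin tokens broken by line wrapping: "8.138e- 13" and "414- 434".
    text = re.sub(r"(?<=[\de])-\s+(?=\d)", "-", text)
    return [
        SignatureHit(m[1], float(m[2]), float(m[3]), int(m[4]), int(m[5]))
        for m in _HIT.finditer(text)
    ]


if __name__ == "__main__":
    for hit in parse_results("BL00741B 14.27 8.138e-\n13 492-515"):
        print(hit)
```

Cells whose text is damaged (for example a missing exponent) simply yield no hit for that entry, so such rows would need to be checked against the source table by hand.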
TABLE 4


SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM
SCORE




-i-g 


Immunoglobulin domain


2.1e-32


109.5


■a 
-j 


t-\ If" 4 T> ^» 43 ^ 


"CS l a rvrt t"_ i nrnt" pin V i 
domain 


1.3e-29


110.7


4 


zf-C2H2


Zinc finger, C2H2 type


1.6e-21


84.9 


5 


fn3 


Fibronectin type III domain 


0 


1097.1


6


fn3


Fibronectin type III domain


0


1035.0


7 


fn3 


Fibronectin type III domain 


0 


1090.4


o 
O 




Fibronectin type III domain




1 097 1 

J* k/ J * - -4. 


9 


TBC 


TBC domain 


4e-40 


146.7 


T ft 

xu 




LyCOCXlTOIDc Jtr^DV 






12 


ank 


Ank repeat 


6e-20 


79.7 


14 




Immunoglobulin domain 


i . ve-OD 


ZZ . I 


15


zf-MYND


MYND finger


l . 3e-Ub 


3 5.4 


16


zf-MYND


MYND finger


1 . 3e-0o 


Or A 

3b .4 


17 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-99 


343.9 


18 


CAP_GLY


CAP-Gly domain 


1.2e-25 


98.7


20 


IMPDH_C 


IMP dehydrogenase / GMP 
reductase C terminus 


1.6e-119


410.5 


21 


IMPDH_C 


IMP dehydrogenase / GMP 
reductase C terminus 


4.3e-102


352.6


22 


pkinase 


Eukaryotic protein kinase 
domain 


2.4e-79


277.0


23 


pkinase 


Eukaryotic protein kinase
domain


8.4e-74


258.6


25 


RNA_pol_A


RNA polymerase alpha subunit 


0 


1077.7


26 


C1q


C1q domain


1.9e-10


44.4


27 


Ribosomal_L23


Ribosomal protein L23


7.8e-32


111.2


28 


Ribosomal_L23


Ribosomal protein L23


1e-29


104.2


30 


zf-A20 


A20-like zinc finger 


1.5e-10


48.5 


31 


zf-A20


A20-like zinc finger


1.5e-10


48.5


32 


FMN_dh 


FMN-dependent dehydrogenase


5.4e-179 


608.1 


34 


PID 


Phosphotyrosine interaction
domain (PTB/PID)


3.8e-59


209.3


35


ig 


Immunoglobulin domain 


1.4e-13


4 O . b 


36


ig


Immunoglobulin domain


l * 4e-i-i 


4 o . o 


40 


kinesin 


Kinesin motor domain 


D . / G /O 




44 


Ets 


Ets-domain 


1.4e-56 


182.1 


45 


Ets


Ets-domain


l . 4e- do 




46 


LRR


Leucine Rich Repeat


1.7e-13


58.3 


48


zf-C2H2


Zinc finger, C2H2 type




ceo a 


49


ITAM


Immunoreceptor tyrosine-based
activation motif


x . 4 e- 








Ubiquitin carboxyl-terminal
hydrolase family


JL * 1C 


102 0 






T TW S rm i f in <**ia rhnwl • t" prmi ti J* 1 

UUIUUJLLIU UaiJJUAYl !■ IUA mm A 
11 v va<x> jl cio a. a.i i i«ju ^ y 


1.1e-26


102.0




■i-k a a 


Ras family


8.5e-45


162.3


53 


PRK 


Phosphoribulokinase


2.1e-65


230.7


54 


myb_DNA-
binding


Myb-like DNA-binding domain


0 . 096 


15.2 


55 


voltage_CLC


Voltage gated chloride channels 


3.3e-186 


631.9 


56 


sugar_tr 


Sugar (and other) transporter


0.00015


-64.3


57 


TBC 


TBC domain 


2.2e-37 


137.6 


58 


ank 


Ank repeat 


5.9e-25 


96 .3 


59 


ank 


Ank repeat 


5.9e-25 


96 .3 


67 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family 


7.9e-49 


175.6 


68 


C2 


C2 domain 


7 .9e-54 


192 .2 


69 


C2 


C2 domain 


2 ,3e-54 


194 .0 


70 


Kelch 


Kelch motif 


9.4e-99 


341.5 


72 




Immunoglobulin domain 


8 .2e-28 


94 .7 


73 


pkinase 


Eukaryotic protein kinase 


8e-69 


242 .1 
SEQ ID

NO:


PFAM NAME


DESCRIPTION


p-value


PFAM 
SCORE 






domain 






74 


pkinase 


Eukaryotic protein kinase 
domain 


2.8e-38 


140.6 


76 


zf-C4_Topoisom


Topoisomerase DNA binding C4 
zinc ring 


5.4e-54 


192.8 


83 


Peptidase_S9 


Prolyl oligopeptidase family


4.3e-10


36.8 


84


fn3


Fibronectin type III domain


4.1e-51


183.2


86


SH2


Src homology domain 2


3.1e-22


67.7


88


ig


Immunoglobulin domain


0.0091


14.0






t_3 ■ 1 _p3 _P^k«V» H_W _y — ^L_ ___ ^^_ ___ ___k ^^fc ___ 

WD domain, G-beta repeat


2.1e-21


84.6


92 


laminin_G


Laminin G domain


6.1e-27 


98.5 




AMP-binding


AMP-binding enzyme


2.4e-13


-37.2


95 


pkinase 


Eukaryotic protein kinase 
domain 


1.4e-59 


211.4 


96 


pkinase 


Eukaryotic protein kinase 
domain 


2.6e-51 


183 .9 


97 


adh_short 


short chain dehydrogenase 


2e-61 


217.5 


98 


kinesin 


Kinesin motor domain


2.2e-86 


300.4 


101 


IRS 


PTB domain (IRS-1 type) 


5.4e-36 


133 .0 


102 


AAA 


ATPases associated with various 
cellular act 


6. 8e-05 


-5.2 


104 


pkinase 


Eukaryotic protein kinase 
domain 


2 . 7e-73 


256.9 


106 


ras 


Ras family 


8.3e-24 


92 .5 


107 


FYVE 


FYVE zinc finger 


5 .4e-27 


100.7 


108 


Cyt_reductase


FAD/NAD-binding Cytochrome
reductase 


7.7e-61 


215.5 


109 


zf-C2H2 


Zinc finger, C2H2 type 


2.3e-122 


420.0 


113 


pkinase 


Eukaryotic protein kinase 
domain 


4e-88 


306.2 


116 


PH 


PH domain 


3.1e-11


45.2 


117 


lipocalin 


Lipocalin / cytosolic fatty- 
acid binding pr 


2.4e-14 


53.5 


118 


pkinase 


Eukaryotic protein kinase 
domain 


4.5e-20


76 .3 


120 


WD40 


WD domain, G-beta repeat 


2.4e-14 


61.1 


121 


WD40 


WD domain, G-beta repeat 


2 .4e-14 


61.1 


123 


IF5_eIF4__eIF 
2 


eIF4-gamma/eIF5/eIF2-epsilon


le-32 


122.2 


124 


ig 


Immunoglobulin domain 


6 . 5e-08 


30 .6 


127 


mito_carr 


Mitochondrial carrier proteins 


3e-16 


58 .6 


128 


PP2C 


Protein phosphatase 2C 


2 .2e-71 


250.6 


129 


ATP1G1_PLM_MAT8


ATP1G1/PLM/MAT8 family


3.1e-20


80.6 


130 


pfkB 


pfkB family carbohydrate kinase 


4 .Se-42 


137.1 


133


ACBP 


Acyl CoA binding protein 


4 . 6e-22 


86 . 7 


134 


rrm


RNA recognition motif. 


1 .2e-31 


118 . 5 


135 


IQ 


IQ calmodulin- binding motif 


2 . 6e-08 


41 . 0 


136 


ATP1G1_PLM_MAT8


ATP1G1/PLM/MAT8 family


9 .3e-22 


85 . 7 


139 


WH2 


Wiskott Aldrich syndrome 
homology region 2 


0 .0067 


23 . 1 


140 


zf -C2H2 


Zinc finger, C2H2 type


1 .7e-82 


287.5 


141 


Peptidase_S26


Signal peptidase I 


5.7e-10 


35.7 


143 


arf 


ADP-ribosylation factor family 


1.2e-39 


145 . 2 


1 A C 

lib 


KRAB


KRAB box


7.3e-30


112 . 0 


148 


DUF6


Integral membrane protein DUF6 


0.096 


8.0 


149 


PDEase 


3'5'-cyclic nucleotide
phosphodiesterase 


3.8e-80 


231.1 


151 


S4 


S4 domain 


1.1e-08


42.3 


153 


tRNA-synt_1d


tRNA synthetases class I (R) 


3 .8e-103 


356 .1 


154 


Cyt_reductase


FAD/NAD-binding Cytochrome
reductase 


7.8e-60 


212.2 


155 


ras 


Ras family 


3 .6e-28 


107 .0 


157 


actin


Actin 


3 .8e-26 


87.1 
SEQ ID
NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


158 


Jacalin 


Jacalin-like lectin domain 


0.09 


-24 . 9 


160 


Zn_carbOpept 


Zinc carboxypeptidase 


Se-138 


471 . 9 


165 


pkinase


Eukaryotic protein kinase 
domain 


5.1e-67 


236 .1 


167 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger) 


5.3e-07 


27 . 0 


168 


Ribosomal_S15


Ribosomal protein S15


1.1e-06


29.0


16V 


DEAD 


DEAD/DEAH box helicase 


1e-48


157 . 0 


171 


DUF59


Domain of unknown function
DUF59


0.07 


-17 .4 


172 


pkinase


Eukaryotic protein kinase 
domain 


3 .7e-15 


58 .6 


173 


globin 


Globin 


4 .6e-18 


67 .4 


174 


WW 


WW domain 


7.3e-06 


32 .9 


175 


ras 


Ras family 


1e-31


118. 8 


178 


ATP1G1_PLM_MAT8


ATP1G1/PLM/MAT8 family 


2.5e-17 


71 .0 


179 


zf -C2H2 


Zinc finger, C2H2 type 


1.5e-99 


344 . 2 


180 


C1q


C1q domain


8.5e-72


251 .9 


190 


Y_phosphatase


Protein-tyrosine phosphatase


4 .9e-287 


967 . 0 


191 


efhand


EF hand 


7.5e-16 


66.1 


193 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-82 


285.6 


194 


bromodomain 


Bromodomain


5.8e-31


111 .4 




PALP 


Pyridoxal -phosphate dependent 
enzyme 


2 .5e-64 


227 .1 


197 


DnaJ 


DnaJ domain 


1 . 6e-38 


141 .4 


199 


RrnaAD 


Ribosomal RNA adenine 
dimethylases 


0. 00018 


16.9 


200 


acid_phosphat


Histidine acid phosphatase


2.5e-10 


37.2 


201 


WH2 


Wiskott Aldrich syndrome 
homology region 2 


0.00048 


26.9 


204 


vATP-synt_AC39


ATP synthase (C/AC39) subunit 


1.3e-159 


543 .7 



205 


vATP-synt_AC39


ATP synthase (C/AC39) subunit 


1.6e-139 


476 .9 


206 


ldl_recept_a 


Low- density lipoprotein 
receptor domain 


2.4e-25 


97 .6 


209 


ank 


Ank repeat 


1.4e-19 


78.4 


210 


Rhomboid 


Rhomboid family 


0.0035


1.2 


211 


C1q


C1q domain


1. 6e-70 


247 .7 


212 


UQ_con 


Ubiquitin-conjugating enzyme


7.4e-74 


258 . 8 


213 


UQ_con


Ubiquitin-conjugating enzyme


1e-53


191.9 


215 


DEAD 


DEAD/DEAH box helicase 


1. 8e-43 


140 .4 


216 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family 


4.5e-21 


83.4 


218 


Glycos_transf_2


Glycosyl transferases


4e-21


83.6 


219 


ig 


Immunoglobulin domain 


0. 092 


10 . 7 


222 


WD40 


WD domain, G-beta repeat 


7 .4e-23 


89 .4 


224 


TPR 


TPR Domain 


1 . 2e-08 


42 . 1 


225 


DnaJ_CXXCXGXG


DnaJ central domain (4 repeats) 


1.5e-38 


141 .5 


226 


DnaJ_CXXCXGXG


DnaJ central domain (4 repeats) 


1. 5e-38 


141.5 


229 


HSP70 


Hsp70 protein 


2.4e-54 


194 .0 


230 


GSHPx 


Glutathione peroxidases 


3.4e-47


170.2 


231 


tsp_1


Thrombospondin type 1 domain 


0.0075 


17.1 


233 


cyclin 


Cyclin


4 . 6e-144 


492 .0 


234 


ras 


Ras family 


4.8e-50 


179.7 


235 


LRR 


Leucine Rich Repeat 


1.2e-30


115.3 


236 


LRR 


Leucine Rich Repeat 


6.7e-29


109.4 


237 


PDZ


PDZ domain (Also known as DHR 
or GLGF) . 


1.7e-09 


45.0 
SEQ ID

NO:


PFAM NAME


DESCRIPTION


p-value


PFAM 
SCORE 


244 


dCMP_cyt_deam


Cytidine and deoxycytidylate
deaminase


2.5e-05


31.1 


245 


ig 


Immunoglobulin domain 


6 .7e-08 


30.5 


248 


wnt


wnt family of developmental 
signaling protei 


9.1e-270 


742.6 


250 


mito_carr 


Mitochondrial carrier proteins 


1.3e-55 


193.6 


"254 


adenylatekinase


Adenylate kinase 


1.8e-14 


55.7 


255 


Cation_efflux


Cation efflux family 


2.8e-33 


124.0 


256 


SH3 


SH3 domain


3 . 9e-14 


60.4 


257 


Aa_trans 


Transmembrane amino acid 
transporter protein 


2.6e-52 


187.2 


258 


adenylatekinase


Adenylate kinase 


2.1e-110


380.2 


259 


HIT 


HIT family 


8.2e-07 


25.3 


260 


Bacterial_PQQ


PQQ enzyme repeat


1.6e-15 


65.0 


262 


proteasome 


Proteasome A-type and B-type


6.5e-64


225.7 


267 


pkinase 


Eukaryotic protein kinase 
domain 


6 .3e-27 


101.0 


270 


filament 


Intermediate filament proteins 


3 . 2e-150 


512 . 5 


271 


Choline_kinase


Choline/ethanolamine kinase


2e-67 


237.4 


277 


Ribosomal_S7 


Ribosomal protein S7p/S5e


3 .3e-20 


80.6 


279 


pkinase 


Eukaryotic protein kinase 
domain 


3 .3e-77 


269.9 


280 


WD40


WD domain, G-beta repeat 


7 . 8e-73 


255.4 


281 


WD40 


WD domain, G-beta repeat 


7 . 8e-73 


255.4 


284 


zf-DHHC 


DHHC zinc finger domain 


4.6e-24 


93 .4 


287 


Exonuclease


Exonuclease 


1 .4e-67 


238.0


291 


SAM 


SAM domain (Sterile alpha 
motif) 


0 .034 


11.2 


292 


SAM 


SAM domain (Sterile alpha 
motif) 


0 . 034 


11.2 


294 


zf-C2H2 


Zinc finger, C2H2 type 


1 .4e-29 


111.7 


295 


zf-C2H2 


Zinc finger, C2H2 type 


2.2e-125


430.0 


296 


mito_carr


Mitochondrial carrier proteins


4.1e-59


205.5 


297 


HMG_box 


HMG (high mobility group) box 


6 . 7e-29 


109.4 


302 


Glycos_transf_4


Glycosyl transferase 


5e-87 


302.5 


304 


tRNA-synt_2 


tRNA synthetases class II (D, K
and N)


1.1e-84


294.8 

- 


305 


KRAB 


KRAB box 


2e-44 


161.0 


306 


rrm 


RNA recognition motif. 


2 .7e-44 


160.6 


308 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


5 .2e-39 


126.1 


309 


DNA_polymeraseX


DNA polymerase X family 


2 .4e-64 


227.2 


311 


F-box 


F-box domain. 


9.5e-08 


39.2 


312 


ig 


Immunoglobulin domain 


6 .8e-19 


65.9 


313 


Ets 


Ets -domain 


8.1e-60


192.3 


315 


Kelch 


Kelch motif 


1 .3e-106 


367.6 


317 


arf 


ADP-ribosylation factor family 


3 .2e-35 


130.4 


318 


sugar_tr 


Sugar (and other) transporter 


0.0003 


-73.1 


320 


pkinase 


Eukaryotic protein kinase 
domain 


8.1e-83 


288.6 


322 


pkinase 


Eukaryotic protein kinase 
domain 


4.9e-81 


282.6 


324 


Xlink 


Extracellular link domain 


4 .5e-143 


331.5 


326 


ARID 


ARID DNA binding domain 


5.1e-37 


136.4 


327 


HMG_box 


HMG (high mobility group) box 


6.7e-29


109.4 


328 


cadherin 


Cadherin domain 


8.1e-81


281.9


331 


chromo 


'chromo' (CHRromatin
Organization Modifier) 


4e-18 


66.7 


333 


Peptidase_M22


Glycoprotease family 


1.2e-136 


467.4 
SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM 
SCORE 


335 


vwa


von Willebrand factor type A
domain 


2 . 3e-07 


37 . 9 


339 


ras 


Ras family 


7 .8e-07 


-59.1


340 


zf -C2H2 


Zinc finger, C2H2 type 


8 .2e-64 


225.4 


342 


zf -C2H2 


Zinc finger, C2H2 type 


2 .4e-85 


297.0


343 




Immunoglobulin domain 


0.0005 


18.0 


346 


pkinase 


Eukaryotic protein kinase 
domain 


6 .5e-65 


229.1 


347 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-65 


229.1 


351 


EGF 


EGF-like domain 


8.5e-20 


79.2 


352 


ank


Ank repeat 


2 .5e-101 


350.0 


354 


TBC 


TBC domain 


5.1e-15


63.3 


355 


PHD 


PHD- finger 


3 .2e-07 


37.4 


358 


DUF6 


Integral membrane protein DUF6


0.033 


15.8 


359 


zf -C2H2 


Zinc finger, C2H2 type 


7.4e-20 


79.4 


361 


ank 


Ank repeat 


6.6e-34


126.1 


362 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


4 .7e-53 


189.7 


363 


efhand


EF hand


5.4e-10


46.6 


367 


LRR


Leucine Rich Repeat 


8.8e-44 


158 .9 


368 


laminin_G


Laminin G domain


1.5e-33


121.7


369 


PP2C 


Protein phosphatase 2C


5.3e-20


73.9


372


LIM


LIM domain containing proteins


9.9e-15


57.1


373 




KRAB box


4.8e-23


90.0




L L alio 


■1U11 UlOlldJUUll, ^lUUClli 




-4 . 2 


"* 77 


DCaLU 






704 5 


joy 




domain 

tbM 


-4> « OC ^7 


327 5 


381 


AMP-binding


AMP-binding enzyme 


1.4e-07 


-140.3 


3 ft? 

«J O A 




transferase ) 


T 3e-07 

•&» » -J v.* v 9 


-13 5 


384 


ank 


Ank repeat


2.5e-101


350.0


386 




Immunoglobulin domain


9.5e-05


23.6


388 


zf-C2H2


Zinc finger, C2H2 type


1.7e-42


154.6


389 


ig


Immunoglobulin domain


2.8e-15


54.3


390 


mito_carr


Mitochondrial carrier proteins


3.5e-67


233.2


392


TPR 


TPR Domain 


6.1e-17 


69.7 


393


SH3 


SH3 domain 


3 .5e-09 


43 .9 


394


AAA 


ATPases associated with various 
cellular act 


4 .le-21 


83 .6 


396 


spectrin 


Spectrin repeat 


2.1e-67 


237.3


397 


zf-C2H2 


Zinc finger, C2H2 type 


0 . 0066 


23 .1 


399 


fn3 


Fibronectin type III domain 


4 .le-102 


352.6 


400 


WD40 


WD domain, G-beta repeat 


0.00049 


26.8 


401 


E1_dehydrog


Dehydrogenase E1 component


3e-119 


409.6 


402 


fn3 


Fibronectin type III domain 


0 


1719.6 


404 


LRR 


Leucine Rich Repeat 


2 .le-10 


48 . 0 


405


cadherin 


Cadherin domain 


8.1e-81


281.9


406 


zf-CXXC


CXXC zinc finger 


5e-15 


63.4 


410 


RhoGEF


RhoGEF domain 


1.1e-23


92.1 


411 


F-box


F-box domain.


4.2e-06


33.7 


412 


SNF2_N 


SNF2 and others N- terminal 
domain 


5.8e-16


61.6 


415 


CPSase_L_chain


Carbamoyl-phosphate synthase
(CPSase) 


1.5e-172 


586.6 


418 


LRR 


Leucine Rich Repeat 


3 .8e-24 


93.6 


419 


DENN 


DENN (AEX-3) domain 


2e-58 


207.5 


420 


RasGEF 


RasGEF domain 


8.1e-43 


155.7 


421 


ank 


Ank repeat 


1.4e-153 


523.7 


424 


G-patch


G-patch domain 


le-19 


78 .9 


425 


pkinase 


Eukaryotic protein kinase 
domain 


2.2e-31 


117.1 


426 


Plexin_repeat


Plexin repeat 


0.0023 


24 .6 


427 


Plexin_repeat


Plexin repeat 


0.0023 


24.6 
SEQ ID
NO:


PFAM NAME


DESCRIPTION


p-value


PFAM 
SCORE 












429 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


8.6e-11


39.2 


431 


DEAD 


DEAD/DEAH box helicase 


1e-66


214 .0 


432 


SH3


SH3 domain


3.4e-16


67.2 


433 


GTP_CDC 


Cell division protein 


2.1e-114


393.5


436


Collagen 


Collagen triple helix repeat
(20 copies) 


4 . 6e-194 


658 .1 


438 


Ricin_B_lectin


Similarity to lectin domain of 
ricin b 


0 . 0085 


10.5 


441


Alpha_adaptin_C


Alpha adaptin carboxyl-terminal
domai


1 . 2e-2o6 


866 . 0 


442 


Alpha_adaptin_C


Alpha adaptin carboxyl-terminal
domai 


1.8e-235 


795.7 


443 


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


1 . 9e-65 


230 . 9 


445 


LON 


ATP- dependent protease La (LON) 
domain 


0 . 00012 


-17.1 


446 




Immunoglobulin domain 


0 . 00011 


20 . 1 


451


sushi 


Sushi domain (SCR repeat) 


1 .4e-18 


75.2 


452


fn3 


Fibronectin type III domain 


1 . 5e-06 


35.2 


454 


pyridoxal_deC


Pyridoxal-dependent
decarboxylase conse


6 .3e-14 


50.3 


456 


kinesin 


Kinesin motor domain 


4 . 9e-217 


734.4 


457 


neur_chan 


Neurotransmitter-gated ion-
channel 


le-175 


597.1 


458 


Josephin


Josephin


0 . 0002 


18 . 7 


468 


bZIP 


bZIP transcription factor 


1 .7e-07 


31.8 


470 


NTP_transferase


Nucleotidyl transferase


6.3e-06


-26 .3 


471 


WD40 


WD domain, G-beta repeat 


2e-28 


107.9 


473 


LIM 


LIM domain containing proteins 


0 .00021 


20.7 


477 


zf-RanBP 


Zn- finger in Ran binding 
protein and others . 


0.028 


21. 0 


479 


WD40 


WD domain, G-beta repeat 


6.5e-18


73 . 0 


480


KRAB 


KRAB box 


le-31 


118 .8 


481


ArfGap 


Putative GTP-ase activating 
protein for Arf 


8.4e-65


232 .0 


485


SH2 


Src homology domain 2 


0.011 


11 .4 


486 


C1q


C1q domain


4.3e-74 


259.6 


487 


dsrm


Double-stranded RNA binding
motif


1.1e-47


171 . 9 



489 


zf-C2H2 


Zinc finger, C2H2 type 


4 . 8e-153 


521 . 9 


490 


Alpha_adaptin_C


Alpha adaptin carboxyl-terminal
domai


3.4e-222


751.6


492 


SKI 


Shikimate kinase 


1.2e-10


48 . 8 


497 


ENV_polyprotein


ENV polyprotein (coat
polyprotein)


2 . 6e-22 


77 . 6 

* 




abhydro 1 ase 


Phospholipase/Carboxylesterase


0.041


/ID 1 


500 


rrm 


RNA recognition motif. 


5.4e-34 


126.4 


501 


WW 


WW domain 


4.6e-18


73 . 4 


502 




Immunoglobulin domain 


1.1e-10


39.5 


504 


abhydrolase 


alpha/beta hydrolase fold 


0 .045 


-3 . 6 


505 


vwa 


von Willebrand factor type A 
domain 


7.1e-62 


219.0 


508 


Na_K_ATPase_C


Na+/K+ ATPase C-terminus


2.3e-145 


496.3 


509 


Exonuclease 


Exonuclease 


1.3e-56 


201. 5 


510 


Glycos_transf_1


Glycosyl transferases group 1 


2.9e-06 


27.0 


511 


Glycos_transf_1


Glycosyl transferases group 1 


2.9e-06 


27.0 


512 


Glycos_transf_1


Glycosyl transferases group 1 


1.9e-09 


38.5 


514 


pro_isomerase


Cyclophilin type peptidyl-
prolyl cis-tr


1. 8e-63 


221 .4 
SEQ ID

NO:


PFAM NAME


DESCRIPTION


p-value


PFAM
SCORE 


515 


EGF 


EGF- like domain 


1.9e-18


74.7


516


Surp 


Surp module 


4 .3e-38 


140 . 0 


523 


ig


Immunoglobulin domain 


3 .3e-06 


25 .0 


526 


UBX 


UBX domain 


1.1e-34


128 .6 


528 


adh_zinc 


Zinc-binding dehydrogenases 


2.7e-34 


127 .4 


530 


SAM 


SAM domain (Sterile alpha 
motif) 


0.046 


10.0 


531 


adh_short


short chain dehydrogenase 


0.0025 


-34 . 1 


532 


mito_carr


Mitochondrial carrier proteins


2.5e-81


281.7 


533 


mito_carr


Mitochondrial carrier proteins


2e-61


213 .5 


534 


thiolase 


Thiolase 


3.5e-183 


622 .0 


535 


FMO-like 


Flavin-binding monooxygenase-
like


0 


1153.7 


536 


SCAN 


SCAN domain 


4e-55 


196 .6 


537


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V) 


3.1e-136 


466 .0 


538 


tRNA-synt_1


tRNA synthetases class I (I, L, 
M and V) 


3 .le-136 


466 .0 


539 


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V) 


1.9e-117 


403 .6 


540 


tRNA-synt_1


tRNA synthetases class I (I, L,
M and V) 


3 .le-136 


466 .0 


541 


vATP-synt_E


ATP synthase (E/31 kDa) subunit 


5.9e-85 


295.7 


543 


zf-C2H2


Zinc finger, C2H2 type


5.5e-69


242 .6 


544 


DUF101 


Protein of unknown function 
DUF101 


8 .5e-38 


139 .0 


545 


TGFb_propeptide


TGF-beta propeptide


1.1e-67


238 .2 


547 


WD40 


WD domain, G-beta repeat 


2 .6e-32 


120.8 


548 


RHD 


Rel homology domain (RHD) . 


1.6e-238


686.2


549


MMR_HSR1 


GTPase of unknown function 


5.4e-67 


236 .0 


551 


HECT 


HECT-domain (ubiquitin-
transferase).


4 .3e-127 


435 .6 


554 


MHC_II_alpha 


Class II histocompatibility 
antigen, alp 


3 .5e-74 


259 . 8 


555 


zf-UBR1


Putative zinc finger in N- 
recognin 


3 .3e-16 


67.3 


556 


Kelch 


Kelch motif 


5.5e-29 


109.7 


561 


AMP-binding


AMP-binding enzyme


2.8e-06 


-163.7 


562 


PABP 


Poly-adenylate binding protein,
unique domai 


4 .9e-38 


139.8


564


Gag_p30


Gag P30 core shell protein 


1.2e-67 


238 .2 


566 


PWWP 


PWWP domain 


8.1e-16 


66.0 


567 


SCAN 


SCAN domain 


7.3e-68 


238.9 


569 


pkinase 


Eukaryotic protein kinase 
domain 


1.5e-84 


294.3 


570 


pkinase 


Eukaryotic protein kinase 
domain 


1.5e-84 


294 .3 


571 


CN_hydrolase


Carbon-nitrogen hydrolase 


0 .00081 


-79 .7 


572


myosin_head


Myosin head (motor domain) 


0 


1495.2 


573 


myosin_head


Myosin head (motor domain) 


0 


1490 .4 


575 


Surp 


Surp module 


1 .7e-23 


91. S 


576 


Surp


Surp module 


1.7e-23 


91.5 


577 


DNA_pol_B


DNA polymerase family B 


0 


1138.6 


578 


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


8.3e-09 


42.7 


579 


LRR 


Leucine Rich Repeat 


4.9e-21


83.3


580 


neur_chan


Neurotransmitter-gated ion-
channel


5.9e-177


601.3 


583 


sushi 


Sushi domain (SCR repeat)


0


1673.0


584


DEAD 


DEAD/DEAH box helicase 


7.3e-36 


116.3


586 


KH-domain


KH domain


2.9e-13


57.5


587 


G-patch 


G-patch domain


2.3e-14 


61.2 


589 


LIM 


LIM domain containing proteins 


2.3e-36 


133.4 


590 


bromodomain


Bromodomain


6.6e-32


114.7


591 


bromodomain


Bromodomain


6.6e-32


114.7
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SEQ ID 

NO: 


PFAM NAME


DESCRIPTION


p-value


PFAM 
SCORE 


592


hormone_rec


Ligand-binding domain of
nuclear hormone


3.5e-22


87.1


593 


PHD 


PHD-finger


3.8e-12


53.8 


594 


cadherin 


Cadherin domain 


4.2e-99 


342.7


596 


pkinase 


Eukaryotic protein kinase 
domain 


5e-92 


319.2 


597 


WD40 


WD domain, G-beta repeat


0.00054


26.7


600 


FG-GAP 


FG-GAP repeat 


4.3e-75 


262.9 


602 


G_Adapt_CT 


Gamma-adaptin, C-terminus


1.1e-53


191.8 


603 


pkinase 


Eukaryotic protein kinase 
domain 


2.3e-86 


300.4


605 


Collagen 


Collagen triple helix repeat 
(20 copies) 


8e-42 


152.4 


606 


mito_carr 


Mitochondrial carrier proteins 


6.3e-67 


232.3 


608 


PWWP 


PWWP domain 


2.6e-28 


107.5 


609 


PWWP 


PWWP domain 


2.6e-28 


107.5 


613 


CAP_GLY


CAP-Gly domain


0.0046


20.1


615 


RFX_DNA_bind
ing


RFX DNA-binding domain


5.2e-54


192.9


616 


kinesin 


Kinesin motor domain 


1.1e-81


284.8


617


kinesin 


Kinesin motor domain 


8.4e-80


278.5 


618 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


0.0098 


13.1 


620 


MATH 


MATH domain 


7.8e-05


22.2


621 


Y_phosphatas
e


Protein-tyrosine phosphatase


1.4e-32 


121.6 


622 


pkinase 


Eukaryotic protein kinase 
domain 


4.4e-40 


146.6


623 


BNR 


BNR repeat 


2.1e-11


51.3


624 


molybdopteri
n


Prokaryotic molybdopterin 
oxidoreductas 


1.4e-12 


42.2 


625 


TPR 


TPR Domain 


1.1e-17


72.2


627 


cNMP_binding 


Cyclic nucleotide-binding
domain


3.7e-58


206.6


630 


adh_short


short chain dehydrogenase 


5e-17 


70.0 


631 


zf-C2H2


Zinc finger, C2H2 type


2.1e-88


307.1 


632 


rrm 


RNA recognition motif.


4e-05 


30.5 


635 


pkinase 


Eukaryotic protein kinase 
domain 


1.6e-104 


360.7 


636 


Fork_head


Fork head domain


5.9e-27


103.0


637 


pkinase 


Eukaryotic protein kinase 
domain 


3.8e-70 


246.5


642 


TPR 


TPR Domain 


4.8e-08


40.1 


643 


efhand 


EF hand 


1.9e-27 


104.6


647 


SNF2_N 


SNF2 and others N-terminal
domain 


1.2e-101 


351.1 


648 


PseudoU_synt
h_2


RNA pseudouridylate synthase


1.9e-55


197.6 


650 


zf-C2H2 


Zinc finger, C2H2 type 


0.0087


22.7


651 


ank 


Ank repeat 


1.3e-17


71.9


652 


I_LWEQ


I/LWEQ domain 


9.5e-101 


341.0 


653 


neur_chan 


Neurotransmitter-gated ion- 
channel 


4.1e-171


581.8


654 


tsp_1


Thrombospondin type 1 domain


4.1e-47


169.9 


659 


FH2 


Formin Homology 2 Domain 


1e-107


371.2


661 


pou 


Pou domain - N-terminal to
homeobox domain 


5.3e-45 


162.9 


662 


C2 


C2 domain 


6.7e-19


76.2


663 


C2 


C2 domain 


6.7e-19 


76.2


664 


C2 


C2 domain 


6.7e-19 


76.2


667


GST


Glutathione S-transferases.


9.3e-34


114.4


668 


LRR 


Leucine Rich Repeat 


9.3e-31 


115.6 


670 


spectrin 


Spectrin repeat 


4e-57 


203.2


671 


I_LWEQ


I/LWEQ domain


9.5e-101 


341.0 


672 


ABC_tran


ABC transporter


5.3e-60


212.8


674 


WD40 


WD domain, G-beta repeat 


4.8e-24 


93.3 
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SEQ ID 
NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


675 


WD40 


WD domain, G-beta repeat


4.8e-24


93.3 


676 


LRR 


Leucine Rich Repeat 


0.0015


25.2


679


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H 
type 


2.6e-29 


107.7


680


zf-C2H2


Zinc finger, C2H2 type


5.2e-05


30.1 


681 


CH 


Calponin homology (CH) domain 


2.4e-17


71.1 


682 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


4.3e-43 


156.6


683 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


0.051


10.8 


687 


Synapsin 


Synapsin


0 


1890.8 


689 


PR55 


Protein phosphatase 2A 
regulatory subunit PR 


0 


1038.8


69X 


homeobox 


Homeobox domain 


8.5e-30 


112.4


696 


Peptidase_M2 
4 


metallopeptidase family M24


2.6e-59


210.5


697 


RhoGEF 


RhoGEF domain 


9.5e-35 


128.9 


698 


PHD 


PHD- finger 


0.008 


9.3 


701 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-123 


422.0 


702 


Sulfatase


Sulfatase


3e-231 


781.6 


703 


zf-C2H2


Zinc finger, C2H2 type 


5.7e-20 


79.8 


707 


Acyl_transf 


Acyl transferase domain 


1.1e-22


88.8


708 


WD40


WD domain, G-beta repeat


4.8e-19


76.7 


710 


Ran_BP1


RanBP1 domain.


8.4e-06 


-7.3 


713 


DEAD 


DEAD/DEAH box helicase


9.9e-42


134.9


714 


PH 


PH domain


1.6e-09


39.0 


715 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


1.5e-37


138.2 


717 


Sialyltransf 


Sialyltransferase family 


7.5e-31


115.9 


718 


ig


Immunoglobulin domain


1e-29


100.8 


719 


integrin_B 


Integrins, beta chain 


0 


1125.4 


720 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


1.1e-08


32.4


722 


Peptidase_C2 


calpain family cysteine 
protease 


3e-145 


495.9


723 


ig 


Immunoglobulin domain 


2.2e-05


22.4


724 


F-box 


F-box domain. 


0.007


23.0


725 


Nop 


Putative snoRNA binding domain 


8.1e-58


205.5 


726 


Nop 


Putative snoRNA binding domain 


8.1e-58


205.5 


727 


WD40


WD domain, G-beta repeat


7.5e-26


99.3 


730 


dsrm 


Double-stranded RNA binding
motif


0.027


12.1 


731 


dynamin 


Dynamin family 


4.2e-16


66.9


733 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H 
type 


2.8e-10


41.7 


735 


CDP-
OH_P_transf


CDP-alcohol
phosphatidyltransferase


4.2e-26


100.1 


738 


DEAD 


DEAD/DEAH box helicase 


8.6e-57


182.5


739 


TSC22 


TSC-22/dip/bun family 


6.5e-32


119.5 


742 


ras 


Ras family 


2.2e-100


346.9 


743 


PMI_typeI


Phosphomannose isomerase type I


1.2e-243


822.9


747 


trypsin 


Trypsin 


6.4e-88


279.4


748 


kazal 


Kazal-type serine protease
inhibitor domain


2.2e-52


187.4


749 


efhand 


EF hand 


6.3e-06


33.1


751 


PHD 


PHD-finger


4.9e-16


66.7


752 


zf-C2H2 


Zinc finger, C2H2 type


3.2e-21


83.9


753 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


6.1e-11


49.8


754 


Ribosomal_L3
9


Ribosomal L39 protein


0.00018


26.7 


755 


PH


PH domain


3.6e-14


55.7


758 


SCAN 


SCAN domain 


1.4e-53 


191.5 


759


PA 


PA domain 


0.0065


23.1


760 


arf 


ADP-ribosylation factor family 


2.2e-19


77.8 


761


CIDE-N


CIDE-N domain 


2.2e-40 


147.6 
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SBQ ID 
NO: 


PFAM NAME 


DESCRIPTION 


p-value


PFAM 
SCORE 


762 


histone


Core histone H2A/H2B/H3/H4 


9.9e-53 


188.6 


763 


zf-MYND 


MYND finger


4.1e-14


60.3 


764 


pou 


Pou domain - N-terminal to
homeobox domain


1e-52


188.6 


767 


vwc 


von Willebrand factor type C
domain


2.9e-34


127.3 


769 


efhand


EF hand


4.8e-11


50.1 


770 


zf-C4 


Zinc finger, C4 type (two 
domains) 


2.4e-53


181.6 


772 


ras 


Ras family 


7e-90 


312.0 


773 


Sulfatase


Sulfatase


1e-142


487.5 


775 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12


55.5


776 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12


55.5 


777 


zf-C2H2


Zinc finger, C2H2 type


1.1e-12


55.5 


778 


rrm


RNA recognition motif.


2.1e-32


121.1


779


G6PD 


Glucose-6-phosphate
dehydrogenase 


1.5e-76 


236.6 


780


spectrin 


Spectrin repeat 


3.7e-29


110.3 


781 


mito_carr


Mitochondrial carrier proteins


4.6e-57


198.5 


782 


SCAN 


SCAN domain 


1.3e-24 


95.2 


783 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


4.1e-07


37.1 


785 


DEAD 


DEAD/DEAH box helicase 


6e-06 


21.7 


786 


ras 


Ras family 


5.3e-39


143.0 


787 


RNase_HII


Ribonuclease HII


2.5e-67


237.1 


790 


PI3_PI4_kina 
se 


Phosphatidylinositol 3- and 4-
kinases


5.4e-108


372.2


795 


cadherin 


Cadherin domain 


2.5e-40


147.4 


796 


ARID 


ARID DNA binding domain 


1.6e-20 


81.6 


797 


trypsin 


Trypsin 


9.9e-20 


64.8 


799 


CH 


Calponin homology (CH) domain


3.7e-15


63.8


801 


Gal-
bind_lectin


Vertebrate galactoside-binding
lectin


4.1e-25


88.7


803 


WD40 


WD domain, G-beta repeat 


0.00082 


26.1


806 


TBC 


TBC domain 


1.8e-26 


101.4 


807 


TBC 


TBC domain 


1.8e-26 


101.4


808 


CN_hydrolase 


Carbon-nitrogen hydrolase 


8.8e-80 


278.5


811 


CBFD_NFYB_HM
F


Histone-like transcription
factor 


6e-14 


59.8 


812 


adh_short 


short chain dehydrogenase 


8.1e-20 


79.3


814


IMP4 


Domain of unknown function 


3 .3e-71 


250.0 


815 


zf-C2H2 


Zinc finger, C2H2 type 


8.2e-66 


232.1 


816 


Pept_tRNA_hy
dro 


Peptidyl-tRNA hydrolase 


1.6e-37 


138.0 


817 


ARID 


ARID DNA binding domain 


2.5e-18 


74.3


826 


IF5_eIF4_eIF
2


eIF4-gamma/eIF5/eIF2-epsilon


1.6e-32


121.5


830 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


1.5e-53 


191.3 


831 


LRR 


Leucine Rich Repeat


2.1e-26


101.1


832


laminin_EGF 


Laminin EGF-like (Domains III
and V)


2e-57


204.2


839 


rrm


RNA recognition motif.


1.3e-22


88.5


840


Y_phosphatas 
e 


Protein-tyrosine phosphatase


2.6e-119


409.8


841 


pkinase 


Eukaryotic protein kinase 
domain 


3.4e-100


346.3


844 


Ribosomal_L2
2e


Ribosomal L22e protein family


1e-64


228.4


846 


IBR 


IBR domain 


9e-15 


62.5 


849 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


7.4e-07 


26.5 


850 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


0.00016 


18.9


851 


SET 


SET domain


5e-30 


113.2 


852 


SRCR 


Scavenger receptor cysteine-


0 


1025.4 
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SEQ ID \ 
NO: 


PFAM NAME


DESCRIPTION 


p-value 


PFAM 
SCORE 






rich domain 






853 


SRCR 


Scavenger receptor cysteine- 
rich domain 


0 


1025.4 


857 


lactamase_B


Metallo-beta-lactamase
superfamily


0.012


-6.0


858


COX6A 


Cytochrome c oxidase subunit
VIa


3.4e-58


206.7


859 


rrm


RNA recognition motif.


5.4e-45


162.9


861 


PRK 


Phosphoribulokinase 


5.1e-62


219.4


863 


mito_carr


Mitochondrial carrier proteins


2.9e-53


185.5 


864 


HSP90 


Hsp90 protein 


4.7e-158


538.5 


866 


ig 


Immunoglobulin domain 


4e-12 


44.1


867 


zf-C2H2


Zinc finger, C2H2 type 


7e-135 


461.5 


872 


histone


Core histone H2A/H2B/H3/H4


4.9e-41


149.8


874 


CPSase_L_cha
in


Carbamoyl-phosphate synthase
(CPSase)


2.1e-218 


739.0 


879 


Ribosomal_S1
2e


Ribosomal protein S12e


2.1e-98


340.3


882 


serpin


Serpins (serine protease 
inhibitors) 


2.5e-42 


145.7


883 


Patatin


Patatin


1.2e-51


182.0


884 


RA 


Ras association (RalGDS/AF-6)
domain


0.044


8.0 


887 


DUF92 


Integral membrane protein DUF92 


2.7e-12 


54.3


889 


sugar_tr


Sugar (and other) transporter


8.2e-63


222.1


893 


DUF28


Domain of unknown function 
DUF28 


1.3e-43 


158.3 


896 


IP_trans


Phosphatidylinositol transfer
protein 


6.5e-98 


338.7 


898 


DEAD 


DEAD/DEAH box helicase


1.5e-48


156.5


899 


KE2 


KE2 family protein 


7e-61 


215.7


900 


KE2 


KE2 family protein 


4.3e-51


183.2


901 


zf-C2H2


Zinc finger, C2H2 type 


2.7e-57 


203.8


902 


ras 


Ras family 


2.3e-75 


263.8


904 


TPR 


TPR Domain 


3.2e-22


87.2 


906 


GBP 


Guanylate-binding protein


8.9e-253


853.1


907 


GBP 


Guanylate-binding protein 


1.1e-239


809.6


908


WD40


WD domain, G-beta repeat 


2.6e-26 


100.8


909 


PH 


PH domain 


1.3e-09


39.4 


910 


zf-C2H2 


Zinc finger, C2H2 type 


2.5e-39


144.1


913 


Epimerase 


NAD dependent
epimerase/dehydratase family


5e-07 


-88.5 


921 


TBC 


TBC domain 


1.5e-09


30.7 


922 


WD40 


WD domain, G-beta repeat 


1.6e-25


98.2


923 


WD40


WD domain, G-beta repeat


8.2e-07


36.1 


924 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


2.9e-05 


29.1 


925 


UQ_con


Ubiquitin-conjugating enzyme


0.00033


-27.6 


926 


CH 


Calponin homology (CH) domain 


3.3e-53


190.2 


928 


WD40 


WD domain, G-beta repeat 


5.9e-48


172.7 


929 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger)


3.1e-10


37.4 


930 


Ribul_P_3_ep
im


Ribulose-phosphate 3 epimerase
family


7.2e-105


361.8 


931 


Ribul_P_3_ep 
im 


Ribulose-phosphate 3 epimerase
family 


1.2e-96 


334.4


936 


C2 


C2 domain 


2.2e-62


220.7


937 


NAP_family 


Nucleosome assembly protein 
(NAP) 


1.1e-22


84.6 


940 


abhydrolase 


alpha/beta hydrolase fold


0.011 


3.1 


944 


Tropomyosin


Tropomyosins


3.2e-07


25.1


948 


pkinase 


Eukaryotic protein kinase 
domain 


3.4e-75


263.2


949 


WD40 


WD domain, G-beta repeat 


1.8e-27


104.7


950 


Acyltransfer
ase


Acyltransferase


1.6e-07


38.4 



WO 01/53312 



PCT/US00/34263 



SEQ ID 

NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


951 


SAM 


SAM domain (Sterile alpha 
motif) 


0.014


14.5


954 


GFO_IDH_MocA


Oxidoreductase family


1.3e-ll 


52.0 


955 


BTB 


BTB/POZ domain


7e-22 


86.1 


956 


BTB 


BTB/POZ domain 


7e-22 


86.1 


957 


CDP-
OH_P_transf


CDP-alcohol
phosphatidyltransferase


0.053


-22.2


959 


ras 


Ras family 


2.4e-97 


336.8


960


ras 


Ras family 


8.4e-43 


155.6


961 


Acetyltransf


Acetyltransferase (GNAT) family


1.2e-08 


42.2 


962 


adh_short 


short chain dehydrogenase


2.4e-31


117.6


963 


mutT 


Bacterial mutT protein 


5.6e-06 


26.2 


969


IF-2B 


Initiation factor 2 subunit 
family 


8.4e-193


653.9


970 


RNase_PH 


3' exoribonuclease family


9e-24 


92.4 


975 


WW 


WW domain 


5.7e-25


96.4 


977 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


3.6e-21


83.7 


978 


Ribosomal_L1
7


Ribosomal protein L17


2.4e-20


81.0 


979 


LIM


LIM domain containing proteins


5.8e-42


152.8


980 


Calsequestri 
n 


Calsequestrin 


1.7e-297 


1001.7 


982 


HSP20 


Hsp20/alpha crystallin family 


1.2e-10


43.2 


983 


oxidored_q6 


NADH ubiquinone oxidoreductase, 
20 Kd sub 


4.8e-63


222.9


988 


TBC 


TBC domain 


2.2e-50


180.8 


989 


TBC 


TBC domain 


2.2e-50 


180.8


993 


tRNA_int_end 
o 


tRNA intron endonuclease 


0.0017 


-34.2 


994 


homeobox


Homeobox domain 


4e-18 


73.6 


997 


pyr_redox 


Pyridine nucleotide-disulphide 
oxidoreducta 


0.012 


11.6 


1000 


mito_carr 


Mitochondrial carrier proteins 


9.7e-123 


421.2 


1001 


RA 


Ras association (RalGDS/AF-6) 
domain 


1.2e-15 


65.4


1004 


DUF81


Domain of unknown function 
DUF81 


0.099 


10.2 


1005 


actin 


Actin 


1.3e-174


574.3


1006 


actin 


Actin 


3.1e-130


428.6


1007 


cpn60_TCP1


TCP-1/cpn60 chaperonin family


3.7e-195


661.8 


1008 


TPR 


TPR Domain 


8.1e-44


159.0 


1009 


zf-C2H2


Zinc finger, C2H2 type


3.6e-61


216.6 


1011 


zf-C2H2 


Zinc finger, C2H2 type 


3.6e-61


216.6 


1012 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


4.7e-15


53.1


1016 


tRNA-synt_2c


tRNA synthetases class II (A) 


2.3e-15 


55.2 


1018


RhoGAP 


RhoGAP domain 


1.6e-78


274.3


1022 


PGAM 


Phosphoglycerate mutase family


3.8e-18


69.7


1026 


HMG_box


HMG (high mobility group) box


8.4e-20


79.2


1027 


TBC 


TBC domain 


7.3e-45 


162.5


1028 


UQ_con 


Ubiquitin-conjugating enzyme


1.4e-49


178.1 


1032 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


0.028


16.3


1034 


Hydrolase 


haloacid dehalogenase-like
hydrolase 


2e-21 


84.6


1037 


KRAB 


KRAB box 


4.8e-06


32.4


1038 


Cation_efflu
x


Cation efflux family 


7.1e-42 


152.5 


1040 


ART 


NAD:arginine ADP-
ribosyltransferase


4.7e-47 


169.1 


1042 


WD40 


WD domain, G-beta repeat 


1.9e-18


74.7


1043 


zf-C2H2 


Zinc finger, C2H2 type 


3.7e-24


93.7


1045


lectin_c


Lectin C-type domain


1.9e-28


108.0


1046 


Glucosamine_
iso


Glucosamine-6-phosphate
isomerase


0.00013 


-25.1 
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SEQ ID 
NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


1047 


ligase-CoA 


CoA-ligases 


4.5e-80


279.4


1049 


ig 


Immunoglobulin domain 


1.7e-09 


35.6


1050 


Ribosomal_L2 
4e 


Ribosomal protein L24e 


2e-33 


124.5


1054 


Amidase


Amidase


4.3e-152


518.7


1055 


rrm


RNA recognition motif.


3.8e-26


100.3 


1058 


annexin 


Annexin 


6.9e-44 


159.2


1059 


PMP22_Claudi
n


PMP-22/EMP/MP20/Claudin family


0.023


-23.6


1060 


homeobox 


Homeobox domain


3.2e-31


117.2


1062 


Acyltransfer
ase


Acyltransferase


0.00065 


10.5 


1064 


AMP-binding 


AMP-binding enzyme


6.6e-100 


345.3 


1065


LRR 


Leucine Rich Repeat 


3.3e-14


60.6


1066 


GTP1_OBG


GTP1/OBG family


4.8e-41


141.8 


1071 


ig


Immunoglobulin domain


8.4e-48


159.1 


1072 


PHD 


PHD-finger


6.8e-07 


36.3 


1074 


DENN 


DENN (AEX-3) domain 


8.3e-33 


121.5


1075 


SCP 


SCP-like extracellular protein


4.7e-41


149.8 


1077 


OLF 


Olfactomedin-like domain 


2.2e-66


234.0


1078 


mito_carr


Mitochondrial carrier proteins


1e-42


149.3 


1079 


WD40


WD domain, G-beta repeat


6.2e-45


162.7


1087


START 


START domain 


1.5e-48 


174.7


1093 


DSPc


Dual specificity phosphatase,
catalytic doma


3.3e-63


223.4


1094 


GSHPx


Glutathione peroxidases


9.6e-41


148.8


1095 


DUF25 


Domain of unknown function 
DUF25 


2e-75 


264.0


1096 


DUF25


Domain of unknown function
DUF25


6e-75 


262.4 


1105 


Nitroreducta 
se 


Nitroreductase family 


1.3e-13


58.6 


1106 


PTE 


Phosphotriesterase family 


1.3e-179


610.1 


1107 


DAGKc 


Diacylglycerol kinase catalytic 
domain 


0.00049


19.6 


1109 


ras 


Ras family 


1.3e-15


40.7 


1115 


ArfGap


Putative GTP-ase activating
protein for Arf


9.7e-47


168.7 


1116 


HMG14_17 


HMG14 and HMG17 


4.4e-21


83.5


1117 


HMG14_17


HMG14 and HMG17


9.9e-12


52.4 


1119 


FAA_hydrolas
e


Fumarylacetoacetate (FAA)
hydrolase fam


2e-83 


290.6 


1120 


pkinase 


Eukaryotic protein kinase 
domain 


1.4e-94 


327.6


1123


abhydrolase 


alpha/beta hydrolase fold


9.2e-23 


89.0 


1129 


pro_isomeras
e


Cyclophilin type peptidyl- 
prolyl cis-tr 


2.2e-56 


197.1 


1131 


DnaJ 


DnaJ domain 


1.6e-30 


114.9 


1132 


WD40 


WD domain, G-beta repeat 


1.3e-19


78.6


1133 


WD40 


WD domain, G-beta repeat 


1.8e-15 


64.9


1134 


PH


PH domain 


0.0015 


17.8 


1136 


Adap_comp_su
b 


Adaptor complexes medium 
subunit family 


1.2e-256 


866.0 


1137 


Adap_comp_su 
b 


Adaptor complexes medium 
subunit family 


2.5e-209 


708.8


1139 


ras 


Ras family 


1.5e-86


301.0 


1141 


pkinase 


Eukaryotic protein kinase 
domain 


9.4e-74 


258.4 


1152 


Acyltransfer
ase


Acyltransferase


1.2e-05


29.9


1153 


IRS 


PTB domain (IRS-1 type)


5.4e-55


196.1


1155


ig


Immunoglobulin domain


1.3e-31 


106.9 


1157 


Asparaginase 
_2 


Asparaginase 


6.4e-72 


252.3 


1159 


GMC_oxred 


GMC oxidoreductases 


4.7e-142


485.3 


1160 


zf-AN1


AN1-like Zinc finger


0.00021 


27.9 
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SEQ ID 

NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


1163 


linker_histo
ne


linker histone H1 and H5 family


3.8e-14


60.4


1164 


DED 


Death effector domain


3.9e-05


30.5


1165 


IRS 


PTB domain (IRS-1 type)


2.6e-43 


157.3 


1166 


IRS 


PTB domain (IRS-1 type)


2.6e-43 


157.3 


1168 


SAM 


SAM domain (Sterile alpha
motif)


0.04 


10.5 




abhydrolase


alpha/beta hydrolase fold


0.098


-7.5 


1174 


SAP 


SAP domain 


3.9e-10 


47.1


1177 


PP2C 


Protein phosphatase 2C 


5.3e-31 


112.5


1178 


WD40 


WD domain, G-beta repeat 


4.7e-35 


129.9 


1180 


Ets 


Ets-domain 


1.8e-09 


33.3


1181 


Collagen 


Collagen triple helix repeat 
(20 copies) 


0.00016


24.7


1182 


TCL1_MTCP1 


TCL1/MTCP1 family 


9.5e-56


198.6


1184 


RasGEF


RasGEF domain


1.7e-88


307.4


1185 


mito_carr


Mitochondrial carrier proteins


1.5e-62


217.3


1187 


UPAR_LY6 


u-PAR/Ly-6 domain


0.0042


15.6


1188 


Orn_DAP_Arg_
dec


Pyridoxal-dependent
decarboxylase


6.2e-128


430.6


1193 


Stathmin 


Stathmin family 


1.8e-90


314.0


1194 


Stathmin 


Stathmin family 


1.8e-90


314.0


1195 


Sec1


Sec1 family


3.2e-183


622.1 


1196 


pyr_redox 


Pyridine nucleotide-disulphide
oxidoreducta


3.1e-32


111.8


1197 


Glyco_transf 
_8 


Glycosyl transferase family 8 


1.2e-09 


45.5


1202 


K_tetra 


K+ channel tetramerisation 
domain 


0.022 


-16.8 


1203 


adh_short


short chain dehydrogenase 


8.3e-45 


162.3 


1206 


Ubie_methylt 
ran 


ubiE/COQ5 methyltransferase
family 


1.3e-121 


417.4


1208 


7tm_3 


7 transmembrane receptor 


7.2e-09 


29.0 


1209 


ank 


Ank repeat 


3.9e-15


63.7


1210 


vATP-
synt_AC39


ATP synthase (C/AC39) subunit


2.5e-128


439.7 


1212 


zf-C2H2


Zinc finger, C2H2 type 


5.5e-17 


69.9 


1213 


efhand


EF hand


3.2e-07


37.4 


1219 


rrm


RNA recognition motif.


2.1e-40


147.7 


1220 


DUF6 


Integral membrane protein DUF6 


0.015 


21.5 


1222 


SCAN 


SCAN domain 


1.5e-71 


251.1 


1223 


G-gamma


GGL domain


3.6e-36 


129.5 


1227 


catalase 


Catalase 


0 


1158.9


1232 


PX 


PX domain 


2.2e-15 


64.5 


1233 


PX 


PX domain 


2.2e-15


64.5


1236 


FCH 


Fes/CIP4 homology domain


3.3e-09


44.0


1241 


Peptidase_M2
0


Peptidase family M20/M25/M40


2e-63 


224.1 


1243 


WW 


WW domain 


0.044 


17.9 


1247 


UPF0006 


Metalloenzyme of unknown 
function UPF0006


6.3e-61


215.8 


1248 


Glycos_trans
f_2


Glycosyl transferases 


4.5e-10 


46.9 


1249 


efhand


EF hand


4e-11


50.4 


1254 


UQ_con 


Ubiquitin-conjugating enzyme


2.1e-73


257.3 


1255 


ras 


Ras family 


2.2e-62 


220.7 


1256 


formyl_trans
f


Formyl transferase 


4.5e-30


108.3


1259 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.3e-13


46.4 


1261 


DiHfolate_re
d


Dihydrofolate reductase


2.1e-69 


241.7 


1262 


G_glu_transp
ept


Gamma-glutamyl transpeptidase 


1.8e-110 


380.4 


1263 


PAS 


PAS domain 


1.3e-08 


36.9 


1265 


LRR 


Leucine Rich Repeat 


4.2e-22


86.9 
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SEQ 1L> 
NO: 




DESCRIPTION


p-value


PFAM 
SCORE 


1266


SCP


SCP-like extracellular protein


6e-29


108.0


1267 


K_tetra 


K+ channel tetramerisation
domain 


2.8e-27 


104.0 


-L c. O -7 


ras


Ras family


1.3e-85


297.9






Zinc finger, C3HC4 type (RING 
finger) 


4.2e-10


37.0 


1276 


abhydrolase


alpha/beta hydrolase fold


5.4e-23


89.8


2. C 1 t 


abhydrolase


alpha/beta hydrolase fold 


5.6e-21 


83.1 


-l 9TQ 




Trypsin 


4.4e-41


132.0 


X A O vl 


PBP


Phosphatidylethanolamine-
binding protein 


1.3e-13 


58.7 


1285




Zinc finger, C3HC4 type (RING 
finger) 


5.6e-14


49.6 


1287 


ank


Ank repeat


1.7e-52


187.8


1294 


fn3 


Fibronectin type III domain 


0.026 


20.9 


1295 


GBP 


Guanylate-binding protein


0.00026


-70.0






PMP-22/EMP/MP20/Claudin family


6.9e-41


149.3


1297 


Rhodanese 


Rhodanese-like domain


3 .2e-14 


60.7 




LIM


LIM domain containing proteins


5.8e-21


79.1 


1301 


rnaseA 


Pancreatic ribonucleases 


4.9e-43 


145.2 


i i m 
liU / 


mito_carr


Mitochondrial carrier proteins


2.1e-53


186.0


1308


WD40


WD domain, G-beta repeat


1.6e-17


71.6


1310 


UPAR_LY6


u-PAR/Ly-6 domain


7.1e-20


75.5


1313 


thiored 


Thioredoxin 


3.6e-05


21.6 


1314 


Aa_trans


Transmembrane amino acid
transporter protein


1 » Dc"D / 




1316 


trypsin 


Trypsin


4.4e-41


132.0


1320 


Ribosomal_L1
3


Ribosomal protein L13


3.9e-62 


219.8 


1327 


Armadillo_se
g


Armadillo/beta-catenin-like
repeats


0.0054 


23.4 


1328 


KRAB 


KRAB box 


0.052 


-5.6


1329 


rrm 


RNA recognition motif.


2.1e-40


147.7


1330 


Bcl-2


Apoptosis regulator proteins,
Bcl-2 family


0.014 


-1.6 


1331


PX


PX domain


2.1e-10


48.0 


1333 


KRAB


KRAB box


1.8e-36


134.6


1334 


UPP_syntheta
se


Putative undecaprenyl
diphosphate synthase


2.3e-89


310.3


1335


UPP_syntheta
se


Putative undecaprenyl
diphosphate synthase


1.8e-59


211.0


1336


DSPc


Dual specificity phosphatase,
catalytic doma


1.2e-31


118.6 


1337


DSPc


Dual specificity phosphatase,
catalytic doma 


2.3e-12 


54.5 


1338 


TPR 


TPR Domain 


0.00021 


28.1 


1340 


metalthio


Metallothionein


0.013 


20.3 


1341 


mutT


Bacterial mutT protein


5.8e-09


36.5 


1343 


Band_41


FERM domain (Band 4.1 family)


1.3e-38


122.5


1344 


Kelch 


Kelch motif 


1.4e-44


161.5 


1345 


Antifreeze 


Antifreeze protein 


1.2e-10


48.8 


1347 


3Beta_HSD


3-beta hydroxysteroid
dehydrogenase/isomera


0.086


-177.2


1348 


BTB 


BTB/POZ domain


5.3e-28 


106.5 


1349 


DUF6 


Integral membrane protein DUF6


0.033 


15.8 


1350 


myosin_head 


Myosin head (motor domain) 


0


1088.7


1352 


Nramp 


Natural resistance-associated
macrophage pro 


1.2e-202 


686.6 


1353 


S_100 


S-100/ICaBP type calcium 
binding domain 


5.3e-23 


89.9 


1355 


DEAD 


DEAD/DEAH box helicase


3.6e-65


209.0


1356 


C2 


C2 domain 


2.4e-15 


64.4


1357 


RBD 


Raf-like Ras-binding domain


4.2e-57


203.1


1360 


zf-C2H2


Zinc finger, C2H2 type


7.4e-141


481.4 


1361 


HMG14_17


HMG14 and HMG17


7.9e-40


145.7
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SBQ ID 
NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


1362 


SIS 


SIS domain 


3.8e-30 


113.6


1363 


SIS 


SIS domain 


1.3e-28 


108.5


1364


ig


Immunoglobulin domain


0.00026 


19.0 


1368 


K_tetra 


K+ channel tetramerisation 
domain 


1.1e-16


68.9 


1371 


Collagen 


Collagen triple helix repeat 
(20 copies) 


2.2e-113 


390.1


1372 


DnaJ 


DnaJ domain


6.6e-36 


132.7 


1376 


KRAB


KRAB box


2.1e-38


141.0


1378 


ELM2


ELM2 domain


2e-23 


91.3 


1380


thiored 


Thioredoxin 


1.2e-23 


82.8


1381 


ank 


Ank repeat 


2.3e-83 


290.4 


1382 


BTB 


BTB/POZ domain 


3e-12 


50.8 


1383 


WD40 


WD domain, G-beta repeat 


1.6e-19 


78.3


1384 


WD40 


WD domain, G-beta repeat 


6.3e-24


92.9 


1387 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


1.1e-09


35.6 


1389 


zf-C2H2 


Zinc finger, C2H2 type


5.5e-50 


179.5 


1390 


zf-C2H2 


Zinc finger, C2H2 type 


2.5e-85 


296.9


1393 


kinesin 


Kinesin motor domain 


7.8e-188 


637.4 


1394 


zf-C2H2 


Zinc finger, C2H2 type 


1.2e-49 


178.4 


1398 


KRAB 


KRAB box 


5.1e-22 


86.6


1402 


bZIP 


bZIP transcription factor 


0.035 


13.1 


1405 


sugar_tr 


Sugar (and other) transporter 


0.003 


-101.5 


1406 


RhoGAP 


RhoGAP domain 


8.9e-47


168.8 


1407 


rrm 


RNA recognition motif.


1e-35


132.1


1408


LRR 


Leucine Rich Repeat 


2.1e-13 


58.0 


1409 


Nebulin_repe
at


Nebulin repeat 


6e-54 


192.6 


1410 


ank 


Ank repeat 


1.6e-17


71.6


1412 


Ribosomal_L5
_C


ribosomal L5P family C-terminus


8.2e-58


205.5 


1415 


trypsin 


Trypsin 


4.7e-85


270.4


1416 


aminotran_1


Aminotransferases class-I


4.4e-05 


-91.2 


1417 


S1


S1 RNA binding domain


1.6e-07


33.1 


1419 


WD40


WD domain, G-beta repeat


2.2e-09


44.6


1422 


cadherin 


Cadherin domain 


8.3e-42


152.3


1424 


SH3 


SH3 domain 


2.5e-80


280.3 


1425 


PHD 


PHD-finger


3.2e-17


70.6 


1426 


PHD 


PHD-finger 


3.2e-17


70.6


1427 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


le-37 


138.8 


1428 


helicase_C 


Helicases conserved C-terminal
domain 


le-26 


102.2 


1429 


WD40 


WD domain, G-beta repeat 


3.9e-07


37.2 


1430 


inositol_P 


Inositol monophosphatase family 


2.5e-10


40.2 


1431 


mito_carr


Mitochondrial carrier proteins


4.3e-83


287.7 


1433 


C1q


C1q domain


2.9e-16


66.2 


1434 


WD40 


WD domain, G-beta repeat 


1.6e-13


58.3 


1435 


Inos-1-
P_synth


Myo-inositol-1-phosphate
synthase 


7e-228 


770.4 


1436 


rrm 


RNA recognition motif. 


1.4e-34 


128.3 


1438


ig


Immunoglobulin domain


1.3e-12


45.6 


1440 


G_Adapt_CT 


Gamma-adaptin, C-terminus


3.4e-67


236.7 


1441 


G_Adapt_CT


Gamma-adaptin, C-terminus


3.4e-67


236.7 


1443 


Kelch 


Kelch motif 


0.00013 


28.7 


1446 


ARID 


ARID DNA binding domain 


1.8e-21


84.7


1447 


zf-C2H2 


Zinc finger, C2H2 type 


9.4e-28


105.6 


1448 


AMP-binding 


AMP-binding enzyme 


2.6e-07


-145.1 


1451 


rrm 


RNA recognition motif. 


6.5e-21


82.9 


1454


ig


Immunoglobulin domain


5.6e-44 


146.7 


1455 


Sialyltransf


Sialyltransferase family


5.4e-21


83.2


1460 


Aldose_epim


Aldose 1-epimerase 


1.9e-35


131.2 


1461 


C2 


C2 domain 


4e-18 


73.6


1470 


TIG 


IPT/TIG domain 


3.1e-19


77.3


1472 


PseudoU_synt


RNA pseudouridylate synthase 


4.3e-16 


66.9 




h_2 








t a. no. 




DENN (AEX-3) domain


1.3e-44


161.6


1475 


Cation_efflu


Cation efflux family


4.6e-49


176.4


1477 


TBC 


TBC domain 


8e-47 


169.0


1 A 7ft. 
X / o 


A. A. iu 


RNA recognition motif.


2e-21


84.6


1480 




Immunoglobulin domain 


5.5e-06 


24.3


X*±0*i 


lCXV UXUU CXJL 

pha 


Telomere-binding protein alpha subunit


0 028 


-225 9 








1 8e-68 


240 9 


i /ice 




domain 


9 Se-13 


49 9 


1 /I Q O 


helicase_C


Helicases conserved C-terminal
domain






i /i n q 


uur a? 


rLU Lcln OH UJIKIiOWu XUDCLlon 
TVTFQ Q 




x 0 ^ . ** 


1 /l OA 




fcJJOyi ~ LOA uyuidbaSc/ ISOQiBCOSC 

^ a m "\ T vr 
X CLH LX X jr 




X "4 7 • / 


X H -> X 


guanylate_cyc


Adenylate and Guanylate cyclase
catalyt




166 1 


1 A Q7 








77 2 


1495 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


7.1e-10


36.3 




pjClilaSe 


UU ulct. JL 11 




fl ft 






OH J UUlua jLII, 




27 2 


1502 


homeobox 


Homeobox domain 


0.084


13.8


ibUJ 


homeobox


Homeobox domain




XO . O 






Lur -iiKc domain 




on n 


1506 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


2.7e-21


84.2


1508 


Peptidase_M20


Peptidase family M20/M25/M40




1U1 • 0 


i.JXi 


ny 


jcrA. uurnuxn 


X . J c X X 


J X ■ 


X .£ 


SUi IqL uoC 


OIXX Xatdoc 


2 8f»-35 


13 0 7 


1516 


Syntaxin 


Syntaxin 


0.011


-62.3


. _ 


ouUIlO LCal] -J 


iiiuinucrdiio terdu es> ciusy ~ 111 


7 > « C X V/ O 


J V J • » 


1520




Immunoglobulin domain


0.075


11.0


1521 


RA 


Ras association (RalGDS/AF-6) 
domain 


0.013


13.3


1523 


RhoGAP 


RhoGAP domain 


2.5e-05 


18.7 


1528


WD40


WD domain, G-beta repeat


5.4e-24


93.1


1535


IMS


impB/mucB/samB family


7.8e-95


328.5


1538 


FYVE 


FYVE zinc finger


3.2e-27


101.5


1539 


DAGKc 


Diacylglycerol kinase catalytic 
domain 


6e-07 


36.5 


1540 


Ocular_alb


Ocular albinism type 1 protein


0


1184.7


1653 


SAP 


SAP domain 


6e-06 


33.2


1654 


Amino_oxidase


Flavin containing amine oxidase 


3.2e-43 


157.0 


1655 


Amino_oxidase


Flavin containing amine oxidase


3.2e-43


157.0


1656 


RhoGEF 


RhoGEF domain


1.4e-24 


95.1 


1657 


MMR_HSR1


GTPase of unknown function


0.0011


-45.5 


1659 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


2.5e-11


51.1 


1660 


actin


Actin 


6.6e-21 


69.9 


1661 


BAH 


BAH domain 


1.7e-82 


287.5


1662 


vwa 


von Willebrand factor type A 
domain 


0 


1909.4 


1663 


WD40 


WD domain, G-beta repeat 


1.4e-67 


237.9 


1667 


zf-C2H2 


Zinc finger, C2H2 type 


1.3e-93 


324.4


1669 


Nol1_Nop2_Sun


NOL1/NOP2/sun family


1.3e-23


84.3


1671


SH2 


Src homology domain 2 


5.4e-15


46.9
SEQ ID

NO: 


PFAM NAME 


DESCRIPTION 


p-value 


PFAM 
SCORE 


1672 


chromo


'chromo' (CHRromatin
Organization Modifier)


2.1e-18


67.7


1674 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H 
type 


0.0025 


17.6 


1676 


Glyco_hydro_47


Glycosyl hydrolase family 47


1.8e-187 


636.2 


1677 


Glyco_hydro_ 
47 


Glycosyl hydrolase family 47 


4.5e-74 


259.5 


1680 


WD40 


WD domain, G-beta repeat 


1.1e-27


105.5


1681 


WD40 


WD domain, G-beta repeat 


1.1e-27


105.5 


1683 


MMR_HSR1 


GTPase of unknown function 


1.8e-78 


274.1


1691


rrm 


RNA recognition motif.


1.8e-37


137.9 


1692 


rrm 


RNA recognition motif. 


1.8e-37 


137.9


1693


AAA 


ATPases associated with various
cellular act


1.3e-81


284.5


1697


Ferric_reduct


Ferric reductase like 
transmembrane com 


8.4e-82


285.2


1698 


Ferric_reduc 
t 


Ferric reductase like 
transmembrane com 


3.5e-53 


190.1 


1699 


zf-C2H2


Zinc finger, C2H2 type


4.4e-34


126.6


1700 


arf 


ADP-ribosylation factor family 


9e-19 


75.8


1702 


GTP_EFTU 


Elongation factor Tu family 


0.014


11.4


1703 


SCAN 


SCAN domain 


1.8e-54 


194.4


1707 


pkinase


Eukaryotic protein kinase 
domain 


1.2e-88 


307.9


1709 


WD40


WD domain, G-beta repeat


0.0035


24.0


1710 


LRR 


Leucine Rich Repeat


1.2e-30 


115.3 


1711 


WW 


WW domain 


7.6e-12 


52.8 


1712 


ank 


Ank repeat 


4.2e-34


126.7 


1713 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H 
type 


2.6e-09


38.3 


1714 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H 
type 


2.6e-09


38.3 


1715 


ras 


Ras family 


4.4e-41


149.9 


1718 


HMG_box 


HMG (high mobility group) box 


8.3e-21


82.6 


1719 


TBC 


TBC domain 


1.1e-45


165.2 


1721 


HLH


Helix-loop-helix DNA-binding
domain


9.2e-10


45.9 


1723 


dsrm


Double-stranded RNA binding
motif


2.9e-05


30.9 


1724 


RrnaAD 


Ribosomal RNA adenine 
dimethylases


0.045 


9.2 


1725 


CIDE-N 


CIDE-N domain 


5.9e-40


146.2 


1726 


HAT 


HAT (Half-A-TPR) repeats


2.9e-44


160.5 


1728 


efhand


EF hand


5.1e-20


79.9 


1733 


Hist_deacetyl


Histone deacetylase family


1.7e-104


360.6 


1735 


LRR


Leucine Rich Repeat


4.6e-34


12^.6 


1739 


PI-PLC-X


Phosphatidylinositol-specific
phospholipase


0.0023 


16.1 


1743 


ras 


Ras family 


3.7e-10


-21.3 


1744 


ras 


Ras family 


3.7e-10


-21.3


1745 


RasGEF 


RasGEF domain 


3.2e-49


176.9


1746 


adh_short 


short chain dehydrogenase 


7.1e-08


34.6


1751 


zf-C2H2


Zinc finger, C2H2 type


9e-39


142.2


1754 


fn3 


Fibronectin type III domain 


5.5e-101 


348.9


1756 


zf-C2H2


Zinc finger, C2H2 type 


6.3e-93 


*3 O 1 


1758 


rrm 


RNA recognition motif. 


0.017 


21.2 


1760 


Nop 


Putative snoRNA binding domain 


6.1e-95


328.8 


1761 


Nop 


Putative snoRNA binding domain 


6.1e-95


328.8


1765 


MMR_HSR1 


GTPase of unknown function 


6.4e-41 


149.4 


1769 


CN_hydrolase


Carbon-nitrogen hydrolase 


3e-06 


-43.9 


1775 


ank 


Ank repeat 


4.1e-07


37.1 


1779 


Oxysterol_BP 


Oxysterol-binding protein


4.7e-56


199.6 


1783 


RhoGEF


RhoGEF domain


1.6e-23


91.6


1784 


RhoGEF 


RhoGEF domain 


1.6e-23


91.6 


1785 


rrm 


RNA recognition motif.


6.4e-14 


59.7 



TABLE 5



SEQ ID NO: 


POSITION OF
SIGNAL IN AMINO
ACID SEQUENCE


MaxS (MAXIMUM 
SCORE) 


MeanS (MEAN 
SCORE) 


.a 


X — A ■!■ 


u - y y x 






i" Jl 


ft qo>; 


n O A yl 

u . y 


■a 






U . (ib 




1 _ T Q 




ft OCT 


c 


1 .Of 
.1 — ^0 


U .7 /1 


n 0 f 1 


o 




V.^rl 


O . o63 


■7 


X— 


n on 


0 . 8 63 


0 
a 


X - « 0 


U . j / X 


u . 863 


Q 


X — ^ © 


ft qo 

u . y 0 x. 


U . 90X 


Xv 




n qoi 

U . J7l 


0 . 955 




J. «J 


O D Q Q 

v . y 0 y 


ft O ft ft 


X A 


X — ^ J 


A OCC 

u . y i>i 


ft Oft*) 

0 . 8 03 


-1. O 


1*1(1 




ft ^ ~\ 

0 .625 


14


l-lo 


0.938


0.876


15 


1-25 


0.941


0.811 


16 


1-17 


0.972


0.939


17 


1-27 


0 . 964 


0.777 


18 


1-16 


0.914 


0 .657 


19 


1-19 


0 .953 


0.840 


20 


1-20 


0.935


0.701


21 


1-22 


0.974 


0.850 


22 


1-33 


0 .961 


0.895 


23 


1-19 


0 .991 


0.959 


24 


1-31 


0.995 


0.944 


25 


1-22 


0.976 


0.935 


26 


1-27 


0.996 


0.928 


27 


1-24 


0.953 


0.739 


28 


1-21 


0.906 


0.688 


29 


1-31 


0. 986 


0.841 


30 


1-28 


0. 980 


0.893 


31 


1-19 


0.993 


0.976 


32 


1-22 


0.998 


0.909 


35 


1-33 


0.949 


0.736


36 


1-33 


0.949


0.736 


46 


1-19 


0. 570 


0 .951 


67 


1-25 


0.968 


0.848 


71 


1-18 


0.949 


0 . 845 


72 


1-30


0 . 991 


0.919 


75 


1-29 


0 . 958 


0.854 


So 


1-20 


0 . 986 


0.945 


94 


1-33 


0.994


0 . 943 


17 / 


X — 4 b 


ft r\ r~ y 


ft r ft r 

0 . 595 


1 ft"* 


X-4 9 


u . y 0 j 


□ . 570 


xuo 


1-zb 


O.9/O 


0 . 885 


ITT 
XXX 




n q a 0 
U . 9o9 


ri a 0 o 
0 . 899 


1 9C 


X — x j 


u . yr>r> 


ft 0 m 


lOQ 


x— 1 9 


n at i i 


ft mo 
U - 9Xo 




X — 4. y 


O . 9 /X 


ft 0 >i >i 
0 . 844 


1 ATI 


1-1© 


0 . 914 


ft ^* *^ 0 

0 . 628 




T ft 

1-20 


0 . 969 


0 .904 


ice 
lob 


1-25 


0 . 941 


0 .811 


-L- »J O 


1 _ 9 

X — 


u - s» /y 


O . 9*S / 


160 


1-17 


0.972 


0.939 


161 


1-48 


0.903 


0.571 


162 


1-25 


0.937 


0.729 


168 


1-16 


0.939 


0.826


171 


1-27 


0.964 


0.777 


178 


1-21 


0.945 


0.825 


180 


1-27


0.981 


0.941 


187 


1-28 


0.982 


0.936 


190 


1-19 


0.953 


0.840 


196 


1-22 


0.975


0.916


197 


1-22 


0.963 


0.936 




1-20 


0.935 


0.701


200 


1-23 


0.977 


0 .773 


206


1-30 


0.984 


0.890 


207 


1-19 


0 . 990 


0.924 


208 


1-22 


0.974 


0.850 


210 


1-40 


0.940


0.670


211 


1-28 


0 .971 


0.849 


216 


1-24 


0.986 


0 .956 


218 


1-33 


0.961 


0 .895 


219 


1-19 


0.970 


0.871 


221 


1-19 


0.904 


0.553 


222 


1-21 


0.917 


0.585


230 


1-19 


0 .991 


0.959 


231 


1-26 


0.953 


0.800 


232 


1-25 


0.988 


0 .826 


239


1-23 


0.969 


0.828 


240 


1-17 


0.982


0.955


241 


1-17 


0 .982 


0 .955 


245 


1-30 


0 .970 


0 .722 


248 


1-22 


0.976


0.935 


249 


1-23 


0.968 


0.940


252 


1-18 


0.971 


0 .923 


261 


1-24 


0 .883 


0 .587 


265 


1-18 


0.939 


0.868 


272 


1-24 


0 .953 


0 .739 


283 


1-21 


0 .906 


0 .688 


284 


1-29 


0.997 


0.854 


290 


1-31 


0.986 


0.841 


302 


1-28 


0.980 


0 .893 


304 


1-16 


0 .907 


0 .635 


312 


1-19 


0.993 


0.976 


313 


1-17 


0 .930 


0.753 


323 


1-22 


0.998 


0.909 


324 


1-17 


0.982 


0 .954 


328 


1-19 


0 .971 


0.865 


329 


1-22 


0.963 


0.924


330 


1-33 


0.978 


0.841 


331 


1-24 


O.S20 


0 . 712 


332 


1-24 


0 .975 


0 .881 


333 


1-19


0.984


0.941 


334 


1-20 


0.899 


0.567 


335 


1-27 


0 .942 


0.813 


336 


1-20 


0.952 


0 . 850 


337 


1-38 


0.942 


0.653 


338 


1-27 


0.973 


0.772 


339 


1-36 


0.979 


0.804 


340 


1-27 


0. 888 


0.597 


343 


1-19 


0 . 971 


0 . 865 


344 


1-22 


0 .994 


0.928 


345 


1-17 


0.966 


0 .687 


346 


1-19 


0.936 


0.822 


347 


1-22 


0.963 


0 .924 


349 


1-24 


0.982 


0.966 


351 


1-21 


0.918 


0.815 


352 


1-31 


0.988 


0 .912 


354 


1-31 


0.974 


0.839 


355


1-29 


0.932 


0.632 


356 


1-15 


0.994 


0.969 


357 


1-33 


0.935 


0.726 


360 


1-27 


0.938 


0.827 


361 


1-25 


0.954 


0.674 


362 


1-22 


0.929 


0 .788 


363


1-21 


0.881 


0.715 


364 


1-33 


0.978 


0.841 


365 


1-33 


0.978 


0.841 


ioo 


1-21 


O . 316 


0 . 820 


Jo / 


1-19 


O . 93 6 


0 . 822 


ICQ 


2-23 


U . S» 72 


0 - 874 


inn 
J /u 


J. - Z4 


n q*> r» 
U . j^U 


0 . 712 


1*71 
J / -t 


1 OA 
± - Jt * 




D . / / J 


no 


A — £ / 


u . yiy 


rk t c o 
0 . 768 


j i j 


1 _ 1 Q 


U . JOO 


U . 34 i> 






O QQ A 




! J / O 


X — J % 


U.JO/ 


U . olU 




A - X / 


n Q Q c 


0 . S50 


■3 /O 


J. - 43 


U . 3 / 1 


0 . 743 




1 -2U 


n • a £ a 
u . 36 8 


0 . 874 


JO J. 


1-20 


0 . 928 


0.782 


382 


1-19 


0 .986 


0 . 934 




1-28 


0 . 965 


0 . 829 


3 o4 


1-39 


0 . 970 


0 . 551 


386 


1-24 


0 .975 


0 . 881 


388 


1-30 


0.989 


0 . 868 


389 


1-19 


0 .984 


0 . 941 


390 


1-26 


0 .971 


0 . 782 


392 


1-20 


0 .981 


0 . 900 


393 


1-16 


0 . 968 


0 . 890 


394 


1-23 


0 .937 


0 .701 


397 


1-22 


0.985 


0 . 854 


399 


1-46 


0.977 


0 . 698 


401 


1-20 


0 .899 


0 .567 


402 


1-22 


0.967 


0 . 931 


403 


1-27 


0.992 


0 .934 


404 


1-19 


0.991 


0 .973 


405 


1-23 


0.994 


0 .921 


407 


1-35 


0.987 


0.658 


408 


1-39 


0.976 


0.551 


409 


1-33 


0.897 


0.570 


410 


1-25 


0.990 


0 .962 


411 


1-38 


0.977


0 . 827 


412 


1-20 


0.944 


0 . 768 


413 


1-20 


0. 988 


0 . 965 


414 


1-46 


0.993 


0 .638 


415 


1-23 


0.981 


0 . 940 


417 


1-29


0. 941 


0 . 672 


418 


1-20 


0 .952 


0.850 


419 


1-19 


0. 986 


0 . 967 


420 


1-29 


0.965


0.861 


421 


1-22 


0.889


0.785


422


1-48


0.982


0 . 862 


424 


1-13 




0 - 333 


vj oo 
4«o 


1-38 


0.942


0 . 653 


ion 


1-18 


0 . 947 


0 . 595 




1-33 


0.957


0.783


A "1 


1-26 


0 . 979 


0.904


A "5 y» 


1-27 


0 . 962 


0.777


435


1-24 


0 . 998 


0 . 977 


436 


1-27 


0 . 973 


0 . 772 


443 


1-1S 


0 . 966 


0 . 940 


448 


1-36 


0 .979 


0 . 804 


453 


1-41 


0 . 958 


0 . 603 


455 


1-33 


0.943 


0.606 


457 


1-27 


0.888


0.597 


462 


1-16 


0.925 


0.681 


486 


1-27 


0.972 


0. 845 


495 


1-24 


0.917 


0 .636 


498 


1-26 


0.993 


0 .690 


505 


1-20 


0.976 


0.926 


507 


1-17 


0.966 


0. 687 


510


1-23


0.930 


0.593 


511


1-23


0.930


0.593 


512 


1-23 


0.930 


0. 593 


515


1-18 


0.978 


0 . 956 


523


1-19 


0.936 


0.822


529 


1-22 


0.963 


0 .924 


545


1-24 


0.982 


0 .966 


550 


1-30 


0.933 


0.713


552 


1-21 


0.973 


0 .912 


554 


1-23 


0.969 


0.784 


571 


1-21 


0.918 


0 . 815 


574 


1-31 


0.988 


0 . 912 


580 


1-39 


0 .525 


0.556 


594 


1-31 


0.974 


0.839 


608 


1-29 


0.932 


0 . 632 


609 


1-29 


0.932


0 . 632 


610 


1-21 


0 .990 


0 . 948 


621 


1-15 


0.994


0.969 


623 


1-33 


0.935 


0.726


653 


1-27 


0 .938 


0 . 827 


668 


1-22 


0.929 


0. 788 


677 


1-16 


0.948 


0. 807 


685 


1-21 


0 .881 


0.715 


699 


1-22 


0.975 


0. 816 


702 


1-31 


0.968 


0. 898 


707 


1-16 


0.880


0.562 


713 


1-25 


0.966 


0.743 


718 


1-19 


0 .936 


0 .822 


719 


1-20 


0 .961 


0.824 


729 


1-29 


0.972 


0.874 


735 


1-46 


0.903 


0 .598 


746 


1-14


0.916 


0.730


747 


1-22 


0.965 


0.876 


748 


1-29 


0.968 


0 . 785 


759 


1-24 


0.961 


0.773 


767 


1-27 


0.919 


0 .768 


768 


1-33 


0.900 


0.585 


773 


1-42 


0.959 


0.702 


779 


1-19 


0.986 


0.945 


797 


1-19 


0.944 


0.759 


798 


1-19 


0 .900 


0.568 


820 


1-17 


0.995 


0. 950 


827


1-49 


0.971 


0.749 


848


1-20 


0.968 


0.874 


864 


1-20 


0.928 


0.782 


866 


1-19 


0.986 


0. 934 


873 


1-23 


0.948


0 . 886 


881 


1-28 


0.965 


0 . 829 


887 


1-39 


0.970


0 . 551 


927 


1-30 


0.989


0 . 868 


934 


1-48 


0 . 988 


0 . 777 


939 


1-39 


0.994 


0 . 889 


944 


1-26 


0.971 


0 . 782 


950 


1-29 


0.957 


0 . 845 


963 


1-20


0 . 981 


0 . 900 


964 


1-20


0 .886 


0 . 558 


973 


1-16 


0 . 968 


0 . 890 


980 


1-34 


0.961 


0.749 


981 


1-20 


0.953 


0 . 822 


984 


1-12 


0.938 


0.780 


1015 


1-22 


0.985 


0. 854 


1040 


1-46 


0. 977 


0. 698 


1052 


1-18 


0.969 


0. 842 


1059 


1-20 


0 .927 


0 . 867 


1065 


1-33 


0.983 


0.918 


1069 


1-22 


0.993


0.935 


1075 


1-27 


0.992


0.934


1080 


1-19 


0 .931 


0.829


1092 


1-19 


0.991 


0.973


1094 


1-46 


0.992 


0.653


1095 


1-30 


0. 974 


0.929


1105 


1-23 


0.994


0.921


1123 


1-35 


0.987 


0.658


1138 


1-32 


0.954 


0.613


1140 


1-33 


0.989 


0.789


1142 


1-33 


0.897 


0.570


1152 


1-25 


0.990 


0.962


1170 


1-38


0.977


0.827


1176 


1-20 


0.944 


0.768


1187 


1-20 


0.988 


0.965


1189 


1-35 


0.967 


0.839


1192 


"1-46 


0.993 


0.638


1193 


1-16 


0.925 


0.710 


1197 


1-29 


0.985


0.853


1208 


1-23 


0.981 


0.940


1225 


1-29 


0.941 


0.672


1245 


1-19 


0.986 


0.967


1258 


1-29 


0.965 


0.861


1265 


1-22 


0.889 


0.785


1266 


1-20 


0 .944 


0.809


1276 


1-48 


0.982 


0.862


1292 


1-19 


0 .979 


0.933


1296 


1-21 


0 .984 


0.944


1297 


1-19 


0.984 


0.953


1332 


1-38 


0.942 


0.653


1358 


1-18 


0 .947 


0.595


1371 


1-33 


0.957 


0.789


1380 


1-26 


0.979 


0.904


1397 


1-27 


0.962 


0.777 


1399 


1-23 


0.997


0.960


1404 


1-24 


0.998


0.977 


1410 


1-15 


0.946


0.845 


1414 


1-24 


0.913


0 . 588 


1415 


1-19 


0.982


0.929 


1416 


1-12 


0.931 


0.891 


1418 


1-30 


0.933


0.563 


1420 


1-20 


0.881


0.561 


1421 


1-19 


0.990


0. 968 


1423 


1-17 


0.968


0.863 


1424 


1-21 


0.885


0.591 


1425 


1-24 


0.913


0.588 


1426 


1-24 


0.913


0.588 


1428 


1-25 


0.957


0.899 


1430 


1-34


0.977 


0.819 


1431 


1-28 


0.979


0.923 


1432 


1-36 


0.957


0.613 


1433 


1-32 


0.921


0.753 


1434 


1-39 


0.983


0. 621 


1435 


1-25 


0 . 910 


0 .631 


1436 


1-42 


0.988 


0 . 868 


1437 


1-22 


0.998 


0 . 980 


1442 


1-20 


0.918 


0 . 753 


1448 


1-12 


0.931 


0.891 


1462 


1-18 


0.968 


0.888


1490 


1-20 


0.881 


0.561 


1518 


1-17 


0 .968 


0 . 863 


1525 


1-21 


0.885 


0.591 


1547 


1-23 


0.974 


0 .891 


1561 


1-25 


0.967 


0.899 


1580 


1-17 


0.923 


0 .824 


1593 


1-28 


0.979


0.923 


1596 


1-16 


n Q*> o 

U . 


0.709


1601 


1-36 


U . 1*3 / 


0.613


1606 


1-22 


/\ Q7Q 


v • o _> J- 


1607 


1-20 


u » u / *k 


0.770


icnn 

lOUO 


1-32 


0.921 


0 .753 


1614 


1-33 


0.969 


0.829 


1616


1-20 


0.959 


0 .869 


1625 


1-39 


0.983 


0 .621 


1632


1-25 


0.910 


0 .631 


1636 


1-33 


0.897 


0.591 


1639 


1-42 


0.988 


0.868 


1645 


1-20 


0.927 


0.568 


1647


1-17 


0 .923 


0 .742 


1648 


1-22 


0.998 


0.980 



TABLE 6



SEQ ID NO:
of full-
length
nucleotide
sequence


SEQ ID
NO: of
full-
length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID
NO:
of contig
peptide
sequence


Priority
docket number_
corresponding
SEQ ID NO: in
priority
application


SEQ ID
NO: in
U.S.S.N.
09/488,725




1 


1787 


3573 


5359 


784CIP2_1 


1103 


2 


1788 


3574 


5360 


784CIP2_2 


2673 


3 


1789 


3575


5361 


784CIP2_3 


4117 


4 


1790 


3576 


5362 


784CIP2_4 


5556 


5


1791 


3577 


5363 


784CIP2_5 


5562 


6 


1792 


3578 


5364 


784CIP2_6 


5562 


7 


1793 


3579 


5365 


784CIP2_7


5562 


8 


1794 


3580 


5366 


784CIP2_8 


5562 


9 


1795 


3581 


5367 


784CIP2_9 


5563 


10


1796 


3582 


5368 


784CIP2_10 


5564 


11 


1797 


3583 


5369 


784CIP2_11 


5565 


12 


1798 


3584 


5370 


784CIP2_12 


5689 


13 


1799 


3585 


5371 


784CIP2_13 


5729 


14 


1800 


3586 


5372 


784CIP2_14 


5745 


15 


1801 


3587 


5373 


784CIP2_15 


5777 


16 


1802 


3588 


5374 


784CIP2_16 


5777 


17 


1803 


3589 


5375 


784CIP2_17 


5789 


18 


1804 


3590 


5376 


784CIP2_18 


5792 


19 


1805 


3591 


5377 


784CIP2_19 


5804 


20 


1806 


3592 


5378 


784CIP2_20 


5805 


21 


1807 


3593 


5379 


784CIP2_21 


5805 


22 


1808 


3594 


5380 


784CIP2_22 


5844 


23 


1809 


3595 


5381 


784CIP2_23 


5844


24 


1810 


3596 


5382 


784CIP2_24 


5850 


25 


1811 


3597 


5383 


784CIP2_25 


5867 


26 


1812 


3598 


5384 


784CIP2_26 


5973 


27 


1813 


3599 


5385 


784CIP2_27 


5995 


28 


1814 


3600 


5386 


784CIP2_28 


5995 


29 


1815 


3601 


5387 


784CIP2_29


6005 


30 


1816 


3602 


5388 


784CIP2_30 


6007 


31 


1817


3603 


5389 


784CIP2_31 


6007 


32 


1818 


3604 


5390 


784CIP2_32 


6009 


33 


1819 


3605 


5391


784CIP2_33


6012 


34 


1820 


3606 


5392 


784CIP2_34


6015 


35 


1821 


3607 


5393 


784CIP2_35 


6016 


36 


1822 


3608 


5394 


784CIP2_36 


6016 


37 


1823 


3609 


5395 


784CIP2_37


6018 


38


1824 


3610 


5396 


784CIP2_38


6018 


39 


1825 


3611 


5397 


784CIP2_39 


6018 


40 


1826 


3612 


5398 


784CIP2_40


6023


41 


1827 


3613 


5399 


784CIP2_41 


60/0 


42 


1828 


3614 


5400 


784CIP2_42 


6081 


43 


1829 


3615 


5401 


784CIP2_43 


6089 


44 


1830 


3616 


5402 


784CIP2_44


6118


45


1831 


3617 


5403 


784CIP2_45 


6118 


46 


1832 


3618 


5404 


784CIP2_46 


6130


47 


1833 


3619 


5405 


784CIP2_47


6177 


48 


1834 


3620 


5406 


784CIP2_48 


6189 


49 


1835 


3621 


5407


784CIP2_49


6191 


50 


1836 


3622 


5408 


784CIP2_50


6204 


51 


1837 


3623 


5409


784CIP2_51


6204 


52 


1838 


3624 


5410 


784CIP2_52 


6284 


53 


1839 


3625 


5411 


784CIP2_53 


6367 


54 


1840 


3626 


5412 


784CIP2_54 


6436 


55 


1841 


3627 


5413 


784CIP2_55


6442 


56 


1842 


3628 


5414 


784CIP2_56 


6445 


57 


1843 


3629 


5415 


784CIP2_57 


6457 


58 


1844 


3630 


5416


784CIP2_58


6458 


59 


1845 


3631 


5417 


784CIP2_59 


6458 


60 


1846


3632 


5418 


784CIP2_60 


6462 


61


1847


3633 


5419


784CIP2_61 


6472 


62


1848


3634 


5420 


784CIP2_62


6499 


63 


1849


3635 


5421 


784CIP2_63 


6499 


64 


1850


3636


5422 


784CIP2_64 


6505 


65 


1851


3637 


5423 


784CIP2_65


6534 


66 


1852


3638


5424


784CIP2_66


6534 


67 


1853


3639


5425 


784CIP2_67 


6540 


68


1854


3640


5426 


784CIP2_68 


6550 


69


1855


3641


5427


784CIP2_69


6550 


70


1856


3642


5428


784CIP2_70


6592 


71 


1857


3643


5429 


784CIP2 73 


6645 


72


1858


3644


5430


784CIP2_72


6671 


73 


1859


3645


5431


784CIP2_73


6763 


74 


1860


3646


5432 


784CIP2 74 


6763 


75 


1861


3647


5433


784CIP2_75


6786 


76 


1862 


3648


5434


784CIP2_76


6824 


77 


1863 


3649


5435


784CIP2_77


6830 


78 


1864 


3650


5436


784CIP2_78


6831 


79 


1865 


3651 


5437


784CIP2_79


6832 


80 


1866 


3652 






6834 


81 


1867 


3653 






6834 


82 


1858 


3654 




TR4.nP2 8"7 


6835 


83 


1859 


3655 


04t^l 




6837 


84 


1870 


3656 


t A A O 

04 4 Z 




6843 


85 


1871 


3657 


5-443 


704PTD7 Q c 


6859 ~ 


86 


[ 1872 


3658 


tZ A A f 

i>44^ 


1 D f± V — 1 1 Z o o 


6915 


87 


1873 


3659 




7PirTD7 R*7 


6932 


88 


1874 


3660 


r— <• ^ 
544 D 




6957 


89 


1875 


3661 


CA A 1 


1 R4f*T R Q ~ 


6961 


90 


1876 


3662 


ca a a 


7R4TTP? QA 


6973 


91 


1877 


3 o63 




7R4C7P2 91 


6973 


92 


1878 


ICC/1 

3 b o4 


_>** JU 


784CIP2 93 


7007 


93 


1879 


3 bob 


CI 
O 1 * O X 


784CIP2 94 


7018 


94 


1880 


1CCC 




784CIP2 95 


7019 


95 


1881 


jot)/ 




784CIP2 96 " 


7020 


96 


1882 


1CC9 




784CIP2 97 


7020 


97 


1 o u 'J 

18 83 


j> boy 




784CIP2 98 


7021 


98 


1 G Q A 

lo o4 


JO / U 




784CIP2 99 


7023 


99 


1 o o c 


JO / J. 


5457 


784CIP2 100 


7027 


100 


i a a C 
looo 


JO / 4 




784CIP2 101 


7028 


101 


a q at 
loo / 


JO / J 


-V ^ — * — 7 


784CIP2 102 


7029 


102 


looo 


J o / ft 


5460 


7B4CIP2 103 


7031 


103 


F i goo 




5461 


784CIP2 104 


7032 


X.SJ'k 


*~ 1 ft OA 


3676 


5462 


784CIP2 105 


7033 






1677 


5463 


784CIP2_106 


T" 7035 


JLUD 


T RQ9 

JL u »7 X 


3678 


5464 


784CIP2_107 


7036 


1 A7 


*A> O J J 


3679 


5465 


784CIP2_108 


7039 


1 HA 
X u o 


1 ROA 
A w J ■* 


3680 

•h/ V W 


5466 


784CIP2_109 


7043 


1 AQ 


1 ROC 


3681 


5467 


784CIP2_110 


7044 


Tin 


1896 

A O J V 


3(j 82 

V V (M 


5468 


784CIP2_111 


7046 


111 


JLO J f 


3683 


5469


784CIP2_112 


7054 


112 


1898 


3684 


5470


784CIP2_113 


7061 


113 


1899 


3685 


5471 


784CIP2_114 


7077 


114 


1900 


3686 


5472


784CIP2_115


7092 


115 


1901 


3687 


5473


784CIP2_116


7094 


116 


1902 


3688 


5474 


784CIP2_117


7106 


117 


1903 


3689 


5475 


784CIP2_118


7107 


118 


1904 


3690 


5476 


784CIP2_119


7111 


119 


1905 


3691 


5477


784CIP2_120


7123 


120 


1906 


3692 


5478 


784CIP2_121 


7142


121 


1907 


3693 


5479 


784CIP2_122


7142 


122 


1908 


3694 


5480 


784CIP2 123 


» X 34 


123 


1909 


3695 


5481 


784CIP2 124 


/ X O V 


124 


1910 


3696 


5482 


7 84CIP2 125 


'ID? 


125 


1911 


3697 


5483 


784CIP2 126 


# X O Z3 


126 


1912 


3698 


5484 


784CIP2 127 


71 97 
< x 7 / 


127 


1913 


3699 


5485 


784CIP2_128


7219


128


1914 


3700 


5486 


784CIP2 129 


/ ^ Z D 


129


1915 


3701 


5487 


784CIP2 130 


7229 


130 


1916 


3702 


5488 


784CIP2 131 


72^4 

/ 3 4 


131 


1917 


3703 


5489 


;oi\.J.r< xj« 


/ -A J 3 


132 


1918 


3704 


5490


7 fl 4 fT P7 1 ~k 


77 ii c 


133 


1919 


3705


5491




TO 1 ft 
/ x 3 O 


134 


1920 


3706 

^ /wo 


R4 

34 7 j£ 


/ D4Ulr£ 133 


*7 O A "7 
/<i4 / 


135 

J- — / — > 


1 921 


3 / V » 


3433 


/o4\_lb'2 I30 


00 ^ n 1 


116 


1 977 




C A OA 
3434 


/04^1cr2 13 ^ 


/2d2 


137 


7 9 2"* 

X7£J 


3 / U 3 


c:4 a c 

34 j 3 


/o4tlr^ lJO 


00 0 


X J o 


j- ^ *± 


ni n 

3 /1U 


CA O £ 
34 y O 


/04L1W 13 3 


oo 0 0 


139 


X 3^3 


3/11 


34 3 ' 


/o4^11r2 14U 


0 0 ^ 
/2 / 3 


■Xt v 


7 Q7£ 
1 3 <5 O 


3 / 1Z 


C/l Q 0 

3*i3o 


/o4Clr2 141 


7282 


X4 X 


7 977 
13& / 


3 /13 


rz a 0 0 
34 33 


"7 O A T OO "1 A O 

7o4ullr2 14 2 


72 88 




1 3^ o 


3/14 




/t)4Lir^ 14 J 


1 O O m 
/2 31 


X4 •? 




J / lb 


5501 


784CIP2 14 4 


7293 


14 4 


1 J J U 


3 /lo 


c c n i 


784 LIF2 14 3 


7294 


14 3 


1 273 1 


3 /I / 


350 3 


7B4Clr2 l4o 


7299 


f lie 


T O 
133A 


3 /lo 


33L/4 


7o4Clr2 14 / 


*f O 

73 0O 


TAT 


l y j j 


3 719 


rr r? r\ rr 

5505 


784CIP2 148 


7312 


IAS 


1934 


3 /2u 


5506 


784CIP2_149 


73 13 


1 A O 

14 y 


i 1 a i c 
1333 


3/21 


5507 


784CIP2 150 


7315 


1DU 


"1 O 1 C 
l33 O 


inn 


c c ** a 


7H4CIP2 131 


7318 




T O T O 

133 / 


3 /23 


550 9 


784CIP2 152 


7321 


T C"5 


133 O 


3 /-s4 


5510 


784CIP2 153 


733 0 


1 


1333 


■3 /^3 


3311 


/04v.lr2_134 


OI "7 1 


1 C.4 


S qa n 

174 U 


3 //io 


3312 


O O yl f~* ICC 

/S4L1 i'2_ - l 3 5 


/ 3 3 3 


1 CC 
133 


"1 OA "1 
134 1 


3 / 


33l3 


1 04 V. 1 F2 15 b 


/350 


13D 


"t Q it O 


Q "7*5 O 
•9 / «i O 


3314 


/o4Llr2 15 / 


OO c 0 
/352 


13 / 


1343 


3/^3 


551 5 1 


O O /l T D *5 1 CD 

/o4^1Jr2 13D 


/3 H4 


1 C.ft 
13 O 


1 QA A 


"t "7"* nf 

3 /3U 


33lO 


TQji #"T T>"5 "ICO 
/ Q4V-.1 Ir2__l3y 


/4UJ 


X33 


1 OAR 
134 3 


J / 31 


cci O 

331 / 


/a4LJlr6 1DU 


#431 


160 


1 94 6 


•9 / 3^ 


33 JL. O 


7ft4r*TP7 Ifil 


74 A 1 

# 4 i X 


^ W -L 


1947 






/ O'lUXr £ Xu^ 


' TO J 


162 

V/ 4* 


194 B 


3734 


*J -J \J 


784CIP2 163 


74 6 7 


163 

•b W 


1949 

J^ ^ » — ' 






7S4CIP2 164 


7471 


164 


1950 


3736


5522 


784CIP2 165 


7493 


165 


1951 


3737


5523 


784CIP2 166 


7502


166 


1952 

mm mm mmJ mm 


3738 


5524 


784CIP2 167 


7511 

r «ir v» 


167 


1953


3739 


5525 

mj +J A\m «J 


784CIP2 168 


7514 


168 


1954 


3740 


5526 


784CIP2 169 


7520 


169 


1955 


3741 


5527 


784CIP2 170 


7541 


170


1956


3742




7ft4r*TP*> 1 71 
/ ul<wirx x / -L 


7 570 


171 


1957


374, "-i 
J / 3 


33^3 


7R4r*TP2 1 7"? 
/ oiv^irx 1/^ 


7S7fli 


172 


1 9SR 

X 3 3 O 


J / 44 


3 3 J u 


/OftLlrx 1/3 


/ 3 O 3 


173 


1 959 

-L. J — > 


"4 "7 A R 
J / 4 3 


33 Jl 


OQAf^TPO T "74 
/o4Llr^ 1 /4 


/ 3 3 


174 


1960 


3746 


5532 


784CIP2 175 


7601 


175 


1961 


3747 


5533 


784CIP2_176 


7602 


176 


1962 


3748 


5534 


784CIP2_177 


7608 


177 


1963 


3749 


5535 


784CIP2_178 


7615 


178 


1964 | 


3750 


5536 


784CIP2_179 


7617 


179 


1965 j 


3751 


5537 


784CIP2 181 


7624 


180 


1966 


3752 


5538 


7B4CIP2_182 


7626 


1B1 


1967 


3753 


5539 


784CIP2 183 


7640 


182 


1968 


3754 


5540 


7B4CIP2 184 


7641 


183 


1969 1 


3755 


5541 


784CIP2 185 


7641 
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SEQ ID NO: 
of full- 
length 
nucleotide 
sequence 


SEQ ID 
NO : of 
full- 
length 
peptide 
sequence 


SEQ ID NO: 
of contig 
nucleotide 
sequence 


SEQ ID 
NO: 

of contig 

peptide 

sequence 


Priority 
docket number^ 
corresponding 
SEQ ID NO: in 
priority 
appi.icat.xoa 


.SEQ ID 
NO: in 
U-S.S .N- 


184 | 1970 | 3756 | 5542 | 784CIP2_186 | 7641
185 | 1971 | 3757 | 5543 | 784CIP2_187 | 7642
186 | 1972 | 3758 | 5544 | 784CIP2_188 | 7649
187 | 1973 | 3759 | 5545 | 784CIP2_189 | 7656
188 | 1974 | 3760 | 5546 | 784CIP2_190 | 7657
189 | 1975 | 3761 | 5547 | 784CIP2_191 | 7657
190 | 1976 | 3762 | 5548 | 784CIP2_192 | 7662
191 | 1977 | 3763 | 5549 | 784CIP2_193 | 7668
192 | 1978 | 3764 | 5550 | 784CIP2_194 | 7673
193 | 1979 | 3765 | 5551 | 784CIP2_195 | /b9U
194 | 1980 | 3766 | 5552 | 784CIP2_196 | 7700
195 | 1981 | 3767 | 5553 | 784CIP2_197 | 7709
196 | 1982 | 3768 | 5554 | 784CIP2_198 | / /Jt>
197 | 1983 | 3769 | 5555 | 784CIP2_199 | 7737
198 | 1984 | 3770 | 5556 | 784CIP2_200 | 7744
199 | 1985 | 3771 | 5557 | 784CIP2_201 | 7771
200 | 1986 | 3772 | 5558 | 784CIP2_202 | 7786
201 | 1987 | 3773 | 5559 | 784CIP2_203 | 7791
202 | 1988 | 3774 | 5560 | 784CIP2_204 | 7797
203 | 1989 | 3775 | 5561 | 784CIP2_205 | 7806
204 | 1990 | 3776 | 5562 | 784CIP2_206 | 7812
205 | 1991 | 3777 | 5563 | 784CIP2_207 | 7812
206 | 1992 | 3778 | 5564 | 784CIP2_208 | 7818
207 | 1993 | 3779 | 5565 | 784CIP2_209 | 7822
208 | 1994 | 3780 | 5566 | 784CIP2_210 | 7827
209 | 1995 | 3781 | 5567 | 784CIP2_211 | 7830
210 | 1996 | 3782 | 5568 | 784CIP2_212 | 7835
211 | 1997 | 3783 | 5569 | 784CIP2_214 | 7840
212 | 1998 | 3784 | 5570 | 784CIP2_215 | /b JO
213 | 1999 | 3785 | 5571 | 784CIP2_216 |
214 | 2000 | 3786 | 5572 | 784CIP2_217 | / CO X
215 | 2001 | 3787 | 5573 | 784CIP2_218 | 7flfi 6
216 | 2002 | 3788 | 5574 | 784CIP2_219 | Tftfi ft
217 | 2003 | 3789 | 5575 | 784CIP2_220 | 7QQ {T
218 | 2004 | 3790 | 5576 | 784CIP2_221 | 7898
219 | 2005 | 3791 | 5577 | 784CIP2_222 | 7900
220 | 2006 | 3792 | 5578 | 784CIP2_223 | 7906
221 | 2007 | 3793 | 5579 | 784CIP2_224 | 7908
222 | 2008 | 3794 | 5580 | 784CIP2_225 | 7909
223 | 2009 | 3795 | 5581 | 784CIP2_226 | 7917
224 | 2010 | 3796 | 5582 | 784CIP2_227 | 7932
225 | 2011 | 3797 | 5583 | 784CIP2_228 | 7940
226 | 2012 | 3798 | 5584 | 784CIP2_229 | 7940
227 | 2013 | 3799 | 5585 | 784CIP2_230 | 7984
228 | 2014 | 3800 | 5586 | 784CIP2_231 | 7984
229 | 2015 | 3801 | 5587 | 784CIP2_232 | 8001
230 | 2016 | 3802 | 5588 | 784CIP2_233 | 8021
231 | 2017 | 3803 | 5589 | 784CIP2_234 | 8029
232 | 2018 | 3804 | 5590 | 784CIP2_235 | 8033
233 | 2019 | 3805 | 5591 | 784CIP2_236 | 8040
234 | 2020 | 3806 | 5592 | 784CIP2_237 | 8052
235 | 2021 | 3807 | 5593 | 784CIP2_238 | 8096
236 | 2022 | 3808 | 5594 | 784CIP2_239 | 8096
237 | 2023 | 3809 | 5595 | 784CIP2_240 | 8113
238 | 2024 | 3810 | 5596 | 784CIP2_241 | 8126
239 | 2025 | 3811 | 5597 | 784CIP2_242 | 8132
240 | 2026 | 3812 | 5598 | 784CIP2_243 | 8137
241 | 2027 | 3813 | 5599 | 784CIP2_244 | 8137
242 | 2028 | 3814 | 5600 | 784CIP2_245 | 8159
243 | 2029 | 3815 | 5601 | 784CIP2_246 | 8159
244 | 2030 | 3816 | 5602 | 784CIP2_247 | 8161
245 | 2031 | 3817 | 5603 | 784CIP2_248 | 8176






WO 01/53312 



PCT/US00/34263









246 | 2032 | 3818 | 5604 | 784CIP2_249 | 8196
247 | 2033 | 3819 | 5605 | 784CIP2_250 | 8200
248 | 2034 | 3820 | 5606 | 784CIP2_251 | 8212
249 | 2035 | 3821 | 5607 | 784CIP2_252 | 8220
250 | 2036 | 3822 | 5608 | 784CIP2_253 | 8238
251 | 2037 | 3823 | 5609 | 784CIP2_254 | 8254
252 | 2038 | 3824 | 5610 | 784CIP2_255 | 8255
253 | 2039 | 3825 | 5611 | 784CIP2_256 | 0J00
254 | 2040 | 3826 | 5612 | 784CIP2_257 | 8296
255 | 2041 | 3827 | 5613 | 784CIP2_258 | 8329
256 | 2042 | 3828 | 5614 | 784CIP2_259 | 8362
257 | 2043 | 3829 | 5615 | 784CIP2_260 | 8429
258 | 2044 | 3830 | 5616 | 784CIP2_261 | 8436
259 | 2045 | 3831 | 5617 | 784CIP2_262 | 8448
260 | 2046 | 3832 | 5618 | 784CIP2_263 | 8472
261 | 2047 | 3833 | 5619 | 784CIP2_264 | 8502
262 | 2048 | 3834 | 5620 | 784CIP2_265 | 8504
263 | 2049 | 3835 | 5621 | 784CIP2_266 | 8507
264 | 2050 | 3836 | 5622 | 784CIP2_268 | 8509
265 | 2051 | 3837 | 5623 | 784CIP2_269 | 8515
266 | 2052 | 3838 | 5624 | 784CIP2_270 | 8519
267 | 2053 | 3839 | 5625 | 784CIP2_271 | 8530
268 | 2054 | 3840 | 5626 | 784CIP2_272 | 8532
269 | 2055 | 3841 | 5627 | 784CIP2_273 | 8532
270 | 2056 | 3842 | 5628 | 784CIP2_274 | 8539
271 | 2057 | 3843 | 5629 | 784CIP2_275 | 8541
272 | 2058 | 3844 | 5630 | 784CIP2_276 | 8543
273 | 2059 | 3845 | 5631 | 784CIP2_277 | 8593
274 | 2060 | 3846 | 5632 | 784CIP2_278 | 8595
275 | 2061 | 3847 | 5633 | 784CIP2_279 | 8615
276 | 2062 | 3848 | 5634 | 784CIP2_280 | 8620
277 | 2063 | 3849 | 5635 | 784CIP2_281 | 8621
278 | 2064 | 3850 | 5636 | 784CIP2_282 | 8623
279 | 2065 | 3851 | 5637 | 784CIP2_283 | 8625
280 | 2066 | 3852 | 5638 | 784CIP2_284 | 8628
281 | 2067 | 3853 | 5639 | 784CIP2_285 | 8628
282 | 2068 | 3854 | 5640 | 784CIP2_286 | 8629
283 | 2069 | 3855 | 5641 | 784CIP2_287 | 8630
284 | 2070 | 3856 | 5642 | 784CIP2_288 | 8631
285 | 2071 | 3857 | 5643 | 784CIP2_289 | 8633
286 | 2072 | 3858 | 5644 | 784CIP2_290 | 8634
287 | 2073 | 3859 | 5645 | 784CIP2_291 | 8635
288 | 2074 | 3860 | 5646 | 784CIP2_292 | 8636
289 | 2075 | 3861 | 5647 | 784CIP2_293 | 8659
290 | 2076 | 3862 | 5648 | 784CIP2_294 | 8660
291 | 2077 | 3863 | 5649 | 784CIP2_295 | 8667
292 | 2078 | 3864 | 5650 | 784CIP2_296 | OOO /
293 | 2079 | 3865 | 5651 | 784CIP2_297 | 8oo5
294 | 2080 | 3866 | 5652 | 784CIP2_298 | ao05
295 | 2081 | 3867 | 5653 | 784CIP2_299 | 8896
296 | 2082 | 3868 | 5654 | 784CIP2_300 | 8978
297 | 2083 | 3869 | 5655 | 784CIP2_301 | 9046
298 | 2084 | 3870 | 5656 | 784CIP2_302 | on a a
299 | 2085 | 3871 | 5657 | 784CIP2_303 | 9116
300 | 2086 | 3872 | 5658 | 784CIP2_304 | 9195
301 | 2087 | 3873 | 5659 | 784CIP2_305 | 9201
302 | 2088 | 3874 | 5660 | 784CIP2_306 | 9307
303 | 2089 | 3875 | 5661 | 784CIP2_307 | 9321
304 | 2090 | 3876 | 5662 | 784CIP2_308 | 9397
305 | 2091 | 3877 | 5663 | 784CIP2_309 | 9405
306 | 2092 | 3878 | 5664 | 784CIP2_310 | 9406
307 | 2093 | 3879 | 5665 | 784CIP2_311 | 9422


308 | 2094 | 3880 | 5666 | 784CIP2_312 | 9494
309 | 2095 | 3881 | 5667 | 784CIP2_313 | 9512
310 | 2096 | 3882 | 5668 | 784CIP2_314 | 9632
311 | 2097 | 3883 | 5669 | 784CIP2_315 | 9661
312 | 2098 | 3884 | 5670 | 784CIP2_316 | 9&&4
313 | 2099 | 3885 | 5671 | 784CIP2_317 | 9691
314 | 2100 | 3886 | 5672 | 784CIP2_318 | 3 /OO
315 | 2101 | 3887 | 5673 | 784CIP2_319 | 9716
316 | 2102 | 3888 | 5674 | 784CIP2_320 | 9721
317 | 2103 | 3889 | 5675 | 784CIP2_321 | 9870
318 | 2104 | 3890 | 5676 | 784CIP2_322 | 9887
319 | 2105 | 3891 | 5677 | 784CIP2_323 | 9923
320 | 2106 | 3892 | 5678 | 784CIP2_324 | 9938
321 | 2107 | 3893 | 5679 | 784CIP2_325 | 9964
322 | 2108 | 3894 | 5680 | 784CIP2_326 | 10007
323 | 2109 | 3895 | 5681 | 784CIP2_327 | 10009
324 | 2110 | 3896 | 5682 | 784CIP2_328 | 10046
325 | 2111 | 3897 | 5683 | 784CIP2_329 | 10156
326 | 2112 | 3898 | 5684 | 784CIP2_330 | 10276
327 | 2113 | 3899 | 5685 | 784CIP2_331 | 10283
328 | 2114 | 3900 | 5686 | 784CIP2B_1 | 152
329 | 2115 | 3901 | 5687 | 784CIP2B_2 | 167
330 | 2116 | 3902 | 5688 | 784CIP2B_3 | 205
331 | 2117 | 3903 | 5689 | 784CIP2B_4 | 210
332 | 2118 | 3904 | 5690 | 784CIP2B_5 | 225
333 | 2119 | 3905 | 5691 | 784CIP2B_6 | 226
334 | 2120 | 3906 | 5692 | 784CIP2B_7 | 264
335 | 2121 | 3907 | 5693 | 784CIP2B_8 | 268
336 | 2122 | 3908 | 5694 | 784CIP2B_9 | 293
337 | 2123 | 3909 | 5695 | 784CIP2B_10 | 293
338 | 2124 | 3910 | 5696 | 784CIP2B_11 | 293
339 | 2125 | 3911 | 5697 | 784CIP2B_12 | 302
340 | 2126 | 3912 | 5698 | 784CIP2B_13 | 311
341 | 2127 | 3913 | 5699 | 784CIP2B_14 | 352
342 | 2128 | 3914 | 5700 | 784CIP2B_15 | 358
343 | 2129 | 3915 | 5701 | 784CIP2B_16 | 368
344 | 2130 | 3916 | 5702 | 784CIP2B_17 | 393
345 | 2131 | 3917 | 5703 | 784CIP2B_18 | 477
346 | 2132 | 3918 | 5704 | 784CIP2B_19 | 508
347 | 2133 | 3919 | 5705 | 784CIP2B_20 | 508
348 | 2134 | 3920 | 5706 | 784CIP2B_21 | 515
349 | 2135 | 3921 | 5707 | 784CIP2B_22 | 578
350 | 2136 | 3922 | 5708 | 784CIP2B_23 | 588
351 | 2137 | 3923 | 5709 | 784CIP2B_24 | 591
352 | 2138 | 3924 | 5710 | 784CIP2B_25 | 593
353 | 2139 | 3925 | 5711 | 784CIP2B_26 | 594
354 | 2140 | 3926 | 5712 | 784CIP2B_27 | 619
355 | 2141 | 3927 | 5713 | 784CIP2B_28 | 620
356 | 2142 | 3928 | 5714 | 784CIP2B_29 | 654
357 | 2143 | 3929 | 5715 | 784CIP2B_30 | 692
358 | 2144 | 3930 | 5716 | 784CIP2B_31 | 753
359 | 2145 | 3931 | 5717 | 784CIP2B_32 | 758
360 | 2146 | 3932 | 5718 | 784CIP2B_33 | 787
361 | 2147 | 3933 | 5719 | 784CIP2B_34 | 833
362 | 2148 | 3934 | 5720 | 784CIP2B_35 | 838
363 | 2149 | 3935 | 5721 | 784CIP2B_36 | 870
364 | 2150 | 3936 | 5722 | 784CIP2B_37 | 891
365 | 2151 | 3937 | 5723 | 784CIP2B_38 | 891
366 | 2152 | 3938 | 5724 | 784CIP2B_39 | 921
367 | 2153 | 3939 | 5725 | 784CIP2B_40 | 924
368 | 2154 | 3940 | 5726 | 784CIP2B_41 | 932
369 | 2155 | 3941 | 5727 | 784CIP2B_42 | 942


370 | 2156 | 3942 | 5728 | 784CIP2B_43 | 958
371 | 2157 | 3943 | 5729 | 784CIP2B_44 | 968
372 | 2158 | 3944 | 5730 | 784CIP2B_45 | 992
373 | 2159 | 3945 | 5731 | 784CIP2B_46 | 1025
374 | 2160 | 3946 | 5732 | 784CIP2B_47 | 1074
375 | 2161 | 3947 | 5733 | 784CIP2B_48 | 1104
376 | 2162 | 3948 | 5734 | 784CIP2B_49 | 1114
377 | 2163 | 3949 | 5735 | 784CIP2B_50 | 1144
378 | 2164 | 3950 | 5736 | 784CIP2B_51 | 1262
379 | 2165 | 3951 | 5737 | 784CIP2B_52 | 1 3 1 H
380 | 2166 | 3952 | 5738 | 784CIP2B_53 | X 3 X 3
381 | 2167 | 3953 | 5739 | 784CIP2B_54 | 1 3 2 R
382 | 2168 | 3954 | 5740 | 784CIP2B_55 | X 4 3 O
383 | 2169 | 3955 | 5741 | 784CIP2B_56 | ±*» O 4
384 | 2170 | 3956 | 5742 | 784CIP2B_57 | 13b4
385 | 2171 | 3957 | 5743 | 784CIP2B_58 | X O X /
386 | 2172 | 3958 | 5744 | 784CIP2B_59 | X /
387 | 2173 | 3959 | 5745 | 784CIP2B_60 | 1 / Z O .
388 | 2174 | 3960 | 5746 | 784CIP2B_61 | T 777
389 | 2175 | 3961 | 5747 | 784CIP2B_62 | T Q ft Q
390 | 2176 | 3962 | 5748 | 784CIP2B_63 | T Q c a
391 | 2177 | 3963 | 5749 | 784CIP2B_64 | i d a q
392 | 2178 | 3964 | 5750 | 784CIP2B_65 | 1926
393 | 2179 | 3965 | 5751 | 784CIP2B_66 | x y b b
394 | 2180 | 3966 | 5752 | 784CIP2B_67 | T O C 7
395 | 2181 | 3967 | 5753 | 784CIP2B_68 | 1995
396 | 2182 | 3968 | 5754 | 784CIP2B_69 | 2005
397 | 2183 | 3969 | 5755 | 784CIP2B_70 | O ft O T
398 | 2184 | 3970 | 5756 | 784CIP2B_71 | ZU33
399 | 2185 | 3971 | 5757 | 784CIP2B_72 | zlOJ
400 | 2186 | 3972 | 5758 | 784CIP2B_73 | O T ft iT
401 | 2187 | 3973 | 5759 | 784CIP2B_74 | OT /C C
402 | 2188 | 3974 | 5760 | 784CIP2B_75 | Zl / 3
403 | 2189 | 3975 | 5761 | 784CIP2B_76 | «S x / b
404 | 2190 | 3976 | 5762 | 784CIP2B_78 | O O T. C
405 | 2191 | 3977 | 5763 | 784CIP2B_79 | O O C A
406 | 2192 | 3978 | 5764 | 784CIP2B_80 | £3UU
407 | 2193 | 3979 | 5765 | 784CIP2B_81 | 7777
408 | 2194 | 3980 | 5766 | 784CIP2B_82 | 73 4 n
409 | 2195 | 3981 | 5767 | 784CIP2B_83 | 2371
410 | 2196 | 3982 | 5768 | 784CIP2B_84 | 2399
411 | 2197 | 3983 | 5769 | 784CIP2B_85 | 2411
412 | 2198 | 3984 | 5770 | 784CIP2B_86 | 2428
413 | 2199 | 3985 | 5771 | 784CIP2B_87 | 2430
414 | 2200 | 3986 | 5772 | 784CIP2B_88 | 2439
415 | 2201 | 3987 | 5773 | 784CIP2B_89 | 2447
416 | 2202 | 3988 | 5774 | 784CIP2B_90 | 2461
417 | 2203 | 3989 | 5775 | 784CIP2B_91 | 2487
418 | 2204 | 3990 | 5776 | 784CIP2B_92 | 2492
419 | 2205 | 3991 | 5777 | 784CIP2B_93 | 2512
420 | 2206 | 3992 | 5778 | 784CIP2B_94 | 2564
421 | 2207 | 3993 | 5779 | 784CIP2B_95 | 2678
422 | 2208 | 3994 | 5780 | 784CIP2B_96 | 2816
423 | 2209 | 3995 | 5781 | 784CIP2B_97 | 2818
424 | 2210 | 3996 | 5782 | 784CIP2B_98 | 2819
425 | 2211 | 3997 | 5783 | 784CIP2B_99 | 2943
426 | 2212 | 3998 | 5784 | 784CIP2B_100 | 3137
427 | 2213 | 3999 | 5785 | 784CIP2B_101 | 3137
428 | 2214 | 4000 | 5786 | 784CIP2B_102 | 3160
429 | 2215 | 4001 | 5787 | 784CIP2B_103 | 3323
430 | 2216 | 4002 | 5788 | 784CIP2B_104 | 3360
431 | 2217 | 4003 | 5789 | 784CIP2B_105 | 3362


432 | 2218 | 4004 | 5790 | 784CIP2B_106 | 3417
433 | 2219 | 4005 | 5791 | 784CIP2B_107 | 3418
434 | 2220 | 4006 | 5792 | 784CIP2B_108 | 3442
435 | 2221 | 4007 | 5793 | 784CIP2B_109 | 3442
436 | 2222 | 4008 | 5794 | 784CIP2B_110 | 3444
437 | 2223 | 4009 | 5795 | 784CIP2B_111 | 3855
438 | 2224 | 4010 | 5796 | 784CIP2B_112 | J O O
439 | 2225 | 4011 | 5797 | 784CIP2B_113 | 4090
440 | 2226 | 4012 | 5798 | 784CIP2B_114 | 4105
441 | 2227 | 4013 | 5799 | 784CIP2B_115 | ** A."* A
442 | 2228 | 4014 | 5800 | 784CIP2B_116 | 41 AT
443 | 2229 | 4015 | 5801 | 784CIP2B_117 | A14Q
444 | 2230 | 4016 | 5802 | 784CIP2B_118 | A 1 QC
445 | 2231 | 4017 | 5803 | 784CIP2B_119 |
446 | 2232 | 4018 | 5804 | 784CIP2B_120 | *i ^ / 1
447 | 2233 | 4019 | 5805 | 784CIP2B_121 | •1 J u **
448 | 2234 | 4020 | 5806 | 784CIP2B_122 |
449 | 2235 | 4021 | 5807 | 784CIP2B_123 | 4 J 11
450 | 2236 | 4022 | 5808 | 784CIP2B_124 | 4Jzl
451 | 2237 | 4023 | 5809 | 784CIP2B_125 | 4 J ^3
452 | 2238 | 4024 | 5810 | 784CIP2B_126 | 4 J J A
453 | 2239 | 4025 | 5811 | 784CIP2B_127 | 4488
454 | 2240 | 4026 | 5812 | 784CIP2B_128 | 4588
455 | 2241 | 4027 | 5813 | 784CIP2B_129 | 5569
456 | 2242 | 4028 | 5814 | 784CIP2B_130 | 5573
457 | 2243 | 4029 | 5815 | 784CIP2B_131 | 5577
458 | 2244 | 4030 | 5816 | 784CIP2B_132 | 5579
459 | 2245 | 4031 | 5817 | 784CIP2B_133 | 5582
460 | 2246 | 4032 | 5818 | 784CIP2B_134 | 5583
461 | 2247 | 4033 | 5819 | 784CIP2B_135 | 5584
462 | 2248 | 4034 | 5820 | 784CIP2B_136 | 5585
463 | 2249 | 4035 | 5821 | 784CIP2B_137 | 5591
464 | 2250 | 4036 | 5822 | 784CIP2B_138 | 5593
465 | 2251 | 4037 | 5823 | 784CIP2B_139 | 5594
466 | 2252 | 4038 | 5824 | 784CIP2B_140 | 5594
467 | 2253 | 4039 | 5825 | 784CIP2B_141 | 5598
468 | 2254 | 4040 | 5826 | 784CIP2B_142 | 5602
469 | 2255 | 4041 | 5827 | 784CIP2B_143 | 5605
470 | 2256 | 4042 | 5828 | 784CIP2B_144 | 5608
471 | 2257 | 4043 | 5829 | 784CIP2B_145 | 5617
472 | 2258 | 4044 | 5830 | 784CIP2B_146 | 5620
473 | 2259 | 4045 | 5831 | 784CIP2B_147 | 5622
474 | 2260 | 4046 | 5832 | 784CIP2B_148 | 5623
475 | 2261 | 4047 | 5833 | 784CIP2B_149 | 5624
476 | 2262 | 4048 | 5834 | 784CIP2B_150 | 5625
477 | 2263 | 4049 | 5835 | 784CIP2B_151 | 5627
478 | 2264 | 4050 | 5836 | 784CIP2B_152 | 5628
479 | 2265 | 4051 | 5837 | 784CIP2B_153 | 5630
480 | 2266 | 4052 | 5838 | 784CIP2B_154 | 5632
481 | 2267 | 4053 | 5839 | 784CIP2B_155 | 5640
482 | 2268 | 4054 | 5840 | 784CIP2B_156 | 5641
483 | 2269 | 4055 | 5841 | 784CIP2B_157 | 5643
484 | 2270 | 4056 | 5842 | 784CIP2B_158 | 5647
485 | 2271 | 4057 | 5843 | 784CIP2B_159 | 5649
486 | 2272 | 4058 | 5844 | 784CIP2B_160 | 5658
487 | 2273 | 4059 | 5845 | 784CIP2B_161 | 5659
488 | 2274 | 4060 | 5846 | 784CIP2B_162 | 5667
489 | 2275 | 4061 | 5847 | 784CIP2B_163 | 5672
490 | 2276 | 4062 | 5848 | 784CIP2B_164 | 5674
491 | 2277 | 4063 | 5849 | 784CIP2B_165 | 5678
492 | 2278 | 4064 | 5850 | 784CIP2B_166 | 5680
493 | 2279 | 4065 | 5851 | 784CIP2B_167 | 5684




494 | 2280 | 4066 | 5852 | 784CIP2B_168 | 5686
495 | 2281 | 4067 | 5853 | 784CIP2B_169 | 5694
496 | 2282 | 4068 | 5854 | 784CIP2B_170 | 5698
497 | 2283 | 4069 | 5855 | 784CIP2B_171 | 5699
498 | 2284 | 4070 | 5856 | 784CIP2B_172 | 5712
499 | 2285 | 4071 | 5857 | 784CIP2B_173 | 5719
500 | 2286 | 4072 | 5858 | 784CIP2B_174 | 5720
501 | 2287 | 4073 | 5859 | 784CIP2B_175 | 5727
502 | 2288 | 4074 | 5860 | 784CIP2B_176 | 5730
503 | 2289 | 4075 | 5861 | 784CIP2B_177 | 5734
504 | 2290 | 4076 | 5862 | 784CIP2B_178 | 5738
505 | 2291 | 4077 | 5863 | 784CIP2B_179 | 5739
506 | 2292 | 4078 | 5864 | 784CIP2B_180 | 5740
507 | 2293 | 4079 | 5865 | 784CIP2B_181 | 5744
508 | 2294 | 4080 | 5866 | 784CIP2B_182 | 5748
509 | 2295 | 4081 | 5867 | 784CIP2B_183 | 5749
510 | 2296 | 4082 | 5868 | 784CIP2B_184 |
511 | 2297 | 4083 | 5869 | 784CIP2B_185 | 5750
512 | 2298 | 4084 | 5870 | 784CIP2B_186 | 3 « Jv
513 | 2299 | 4085 | 5871 | 784CIP2B_187 | 5761
514 | 2300 | 4086 | 5872 | 784CIP2B_188 | 3 / O &
515 | 2301 | 4087 | 5873 | 784CIP2B_189 | 3 / O /
516 | 2302 | 4088 | 5874 | 784CIP2B_190 | 3 1/3
517 | 2303 | 4089 | 5875 | 784CIP2B_191 | ^"7ft"*
518 | 2304 | 4090 | 5876 | 784CIP2B_192 | 3 / O ft
519 | 2305 | 4091 | 5877 | 784CIP2B_193 | 3 / O O
520 | 2306 | 4092 | 5878 | / o^L,l Jr £D^1?9 | 3 / SB
521 | 2307 | 4093 | 5879 | | JOU/
522 | 2308 | 4094 | 5880 | TOT | C ft 1ft
523 | 2309 | 4095 | 5881 | / o^L-jLJr^t* l7Cf | 3 d x y
524 | 2310 | 4096 | 5882 | 784CIP2B_199 | 3v)6 /
525 | 2311 | 4097 | 5883 | 784CIP2B_200 | CflOft
526 | 2312 | 4098 | 5884 | 784CIP2B_201 | 5842
527 | 2313 | 4099 | 5885 | 784CIP2B_202 | 5853
528 | 2314 | 4100 | 5886 | 784CIP2B_203 | 5861
529 | 2315 | 4101 | 5887 | 784CIP2B_204 | 5864
530 | 2316 | 4102 | 5888 | 784CIP2B_205 | 5865
531 | 2317 | 4103 | 5889 | 784CIP2B_206 | 5871
532 | 2318 | 4104 | 5890 | 784CIP2B_207 | 5873
533 | 2319 | 4105 | 5891 | 784CIP2B_208 | 5873
534 | 2320 | 4106 | 5892 | 784CIP2B_209 | 5875
535 | 2321 | 4107 | 5893 | 784CIP2B_210 | 5878
536 | 2322 | 4108 | 5894 | 784CIP2B_211 | 5879
537 | 2323 | 4109 | 5895 | 784CIP2B_212 | 5880
538 | 2324 | 4110 | 5896 | 784CIP2B_213 | 5880
539 | 2325 | 4111 | 5897 | 784CIP2B_214 | 5880
540 | 2326 | 4112 | 5898 | 784CIP2B_215 | 5880
541 | 2327 | 4113 | 5899 | 784CIP2B_216 | 5885
542 | 2328 | 4114 | 5900 | 784CIP2B_217 | 5895
543 | 2329 | 4115 | 5901 | 784CIP2B_218 | 5898
544 | 2330 | 4116 | 5902 | 784CIP2B_219 | 5902
545 | 2331 | 4117 | 5903 | 784CIP2B_220 | 5904
546 | 2332 | 4118 | 5904 | 784CIP2B_221 | 5918
547 | 2333 | 4119 | 5905 | 784CIP2B_222 | 5921
548 | 2334 | 4120 | 5906 | 784CIP2B_223 | 5927
549 | 2335 | 4121 | 5907 | 784CIP2B_224 | 5932
550 | 2336 | 4122 | 5908 | 784CIP2B_225 | 5939
551 | 2337 | 4123 | 5909 | 784CIP2B_226 | 5945
552 | 2338 | 4124 | 5910 | 784CIP2B_227 | 5946
553 | 2339 | 4125 | 5911 | 784CIP2B_228 | 5947
554 | 2340 | 4126 | 5912 | 784CIP2B_229 | 5956
555 | 2341 | 4127 | 5913 | 784CIP2B_230 | 5967
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SEQ ID NO: 
of full- 
length
nucleotide 
sequence 



558 



559 



560 



561 



562 



563
564



565 



566 



567 



568 



569 



570 



571 



572 



576 



577
578



579 



580 



581 



582 



583 



584 
585 



586 



588 



589
590



591
592



593 



594



595 



596 



597 
598 
599 



600
601 



602 



603
604



605
606



607 



608 



SEQ ID 
NO: of
full- 
length 
peptide 
sequence 



2342 
2343 



2344 



2345 



2346 



2347 



2348 



2349 



2350 



2351 



2352 



2353 



2354 



2355 



2356 



2357 



2358 



2359
2360
2361 



2362 



2363 



2364 



2365 



2366 
2367 



2368



2369 



2370 



2371 



2372 



SEQ ID NO: 
of contig
nucleotide 
sequence 



2373 



2374 



2375 
2376



2377 



2378 



2379 



2380 



2381
2382



2383 



2384
2385 



2386 
2387



2388 



2389



2390 



2391 



2392 



2393



2394 



4128 
4129 



4130 
4131 



4132 



4133 



4134 



4135 



4136 



4137 



4138 



4139 



4140 



4141 



4142 



4143 



4144 



4145 
4146 
4147 



4148 



4149 



4150 



4151 



4152 



4153 



4154 



SEQ ID 

NO: 

of contig

peptide 

sequence 



4155 



4156 



4157 



4158 



4159 



4160 



4161 
4162 



4163 



4164 



4165 



4166 



4167 



4168 



4169 



4170 
4171 



4172 
4173 



4174 



4175 



4176 



4177 



4178 



4179 



4180 



5914 
5915 



5916 



5917 
5918 



5919 



5920 



5921 



5922 



5923 



5924 



5925 
5926 



5927 



5928 



5929 



5930 



5931 
5932 
5933



5934 



Priority
docket number_
corresponding
SEQ ID NO: in
priority
application



784CIP2B_232 
784CIP2B 233 



784CIP2B 234 



784CIP2B 235 



784CIP2B 236 



784CIP2B 237 



784CIP2B 238 



784CIP2B 239 



784CIP2B 240 



SEQ ID 
NO: in 
U.S. S.N. 
09/488,725 



784CIP2B 241 



784CIP2B 242 



784CIP2B_243
784CIP2B 244 



784CIP2B_245
784CIP2B_246



784CIP2B 247 



784CIP2B_248
784CIP2B_249
784CIP2B_250 
784CIP2B 251 



5935



5936 



5937 



5938



5939 



5940 



5941 



5942 



5943 



5944 



5945 



5946 



5947 
5948 



5949 



5950 



5951 



5952 



5953 



5954 



5955 



5956 
5957 



5958 
5959 



5960 



5961 



5962 



5963 



5964 



5965 



5966 



784CIP2B 252 



784CIP2B_253
784CIP2B_254
784CIP2B 255 



784CIP2B 256 



784CIP2B 257 



784CIP2B_258



784CIP2B_259



784CIP2B_260



784CIP2B_261



784CIP2B 262 



784CIP2B 263 



784CIP2B 264 



784CIP2B_265 
784CIP2B 266 



784CIP2B 267 



784CIP2B_268
784CIP2B 269 



784CIP2B 270 



784CIP2B_272



784CIP2B 273 



784CIP2B_274



784CIP2B_275 
784CIP2B 276 



784CIP2B_277 
784CIP2B 278 



784CIP2B 279 



784CIP2B 280 



784CIP2B 281 



784CIP2B 282 



784CIP2B 283 



784CIP2B 284 



784CIP2B_285 
784CIP2B 286 



5975 
5977 



5978 



5979 



5980 



5988 



5989 



5991 



5997 



5998 



6003 



6004 



6013 



6028 



6028 



6029 



6031 



6031 
6032 
6037 



6037 



6043 



6044 



6046 



6048 



6049 
6051 



6053 



6060 



6063 



6066 



6067 



6068 



6073 
6076



6076



6077 
6079 



6082 



6088 



6091 



6094 



6101 
6103 



6104 
6108 



6112
6121



6125



6126 



6128 



6129 



6133 
6133 



609 



610 



611 



612 



2395 



2396 



2397 



2398 



4181 



5967 



4182



5968 



784CIP2B 287 



4183 



5969 



784CIP2B 288 



4184 
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613 
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784CIP2B_291 
784CIP2B 292 
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784CIP2B_295 
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620 
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5981 
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6190 
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5990 
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6239 
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^2 
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2433 
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784CIP2B 330 
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665 
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6334 


671 
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6337 
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2459 
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2467 
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6369 
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6379 
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6381 
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6395 
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2480 
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""I 6422" 
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4272 
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4278 
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2495 
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2499 
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6552 [ 
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2523 


4309 


6095 


784CIP2B_421 


6593 


738 


2524 


4310


6096 


784CIP2B_422 


6595


739 


2525 


4311 


6097 


784CIP2B 423 


6599 
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6100 
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784CIP2B_430 


6633 


747 


2533 
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784CIP2B 431 


6634 
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6646 
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755 
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kJ <-lv W «J 
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6134 
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781 
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782 
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784CIP2B_466 
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mam %p» 
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6141 
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6758 
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6144 
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6890 
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* 
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784CIP2B 506 
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B25 
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827 
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6199 
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6191 
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/Ol / 
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4407 
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6194 


784CIP2B_521
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6195 


784CIP2B_522
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6199 
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6200 
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peptide . 
sequence 




sequence 


priority 
application 




1052 


2838 


4624


6410 


784CIP2B_740


7832 


1053


2839 


4625 


6411 


784CIP2B_741 


7839 


1054 


2840 


4626 


6412 


784CIP2B_743


7847 


1055 


2841 


4627 


6413 


784CIP2B_744


7848 


1056


2842 


4628 


6414 


784CIP2B_745


TOO 


1057 


2843 



4629 


6415


784CIP2B_746


78b4 


1058 


2844 


4630


6416


784CIP2B_747


7856 


1059 


2845 


4631 


6417 


784CIP2B_748


rob/ 


1060 


2846 


4632 


6418 


784CIP2B_749


7865


1061 


2847 


4633 


6419 


784CIP2B_750


7874


1062 


2848 


4634 


6420 


784CIP2B_751


7877 


1063 


2849 


4635 


6421 


784CIP2B_752


7880 


1064 


2850 


4636 


6422 


784CIP2B_753 


7882 


1065 


2851 


4637 


6423 


784CIP2B_754 


7884 


1066 


2852 


4638 


6424 


784CIP2B_755 


7886 


1067 


2853 


4639 


6425 


784CIP2B_756 


7888 


1068 


2854 


4640 


6426 


784CIP2B_757 


7889 


1069 


2855 


4641 


6427 


784CIP2B_758


7901 


1070 


2856 


4642 


6428 


784CIP2B_759 


7910 


1071 


2857 


4643 


6429 


784CIP2B_760 


7911 


1072 


2858 


4644 


6430 


784CIP2B_761


7921 


1073 


2859 


4645 


6431 


784CIP2B_762


7923 


1074 


2860 


4646 


6432 


784CIP2B_763


7924 


1075 


2861


4647


6433


784CIP2B_764


7925 


1076 


2862 


4648 


6434 


784CIP2B_765


7928 


1077 


2863


4649 


6435 


784CIP2B_766 


7929 


1078 


2864 


4650 


6436 


784CIP2B_767 


7930 


1079 


2865 


4651 


6437 


784CIP2B_768 


7934 


1080 


2866


4652 


6438 


784CIP2B_769 


7938 


1081 


2867


4653 


6439 


784CIP2B_770


7942 


1082 


2868 


4654 


6440 


784CIP2B_771 


7945 


1083 


2869 


4655 


6441 


784CIP2B_772 


7946 


1084


2870 


4656 


6442 


784CIP2B_773 


7948


1085


2871 


4657 


6443 


784CIP2B_774


7951 


1086 


2872 


4658 


6444 


784CIP2B_775 


7952 


1087 


2873 


4659 


6445 


784CIP2B_776 


7953 


1088 


2874 


4660 


6446 


784CIP2B_777


7954 


1089 


2875 


4661 


6447 


784CIP2B_778 


7957 


1090 


2876 


4662 


6448


784CIP2B_779 



7958 


1091 


2877 


4663 


6449 


784CIP2B_780


7961 


1092 


2878 


4664 


6450 


784CIP2B_781


7965 


1093 


2879 


4665 


6451 


784CIP2B_782




1094 


2880 


4666 


6452 


784CIP2B_783


7979 


1095 


2881 


4667 


6453 


784CIP2B_784



79H6 


1096 


2882 



4668 


6454 


784CIP2B_785




1097 


2883


4669 


6455 


784CIP2B_786



7988 


1098 


2884 


4670 


6456 


784CIP2B_787


T QOf 


1099 


2885 


4671 


6457 


784CIP2B_788


/ yy/ 


1100 


2886 


4672 


6458 


784CIP2B_789



7992 


1101


2887 


4673 


6459 


784CIP2B_790 


7992 


1102 


2888 


4674 


6460 


784CIP2B_791 


7992 


1103 


2889 


4675 


6461


784CIP2B_792




1104


2890


4676








1105 


2891 


4677 


6463 


784CIP2B_794 


8015 


1106 


2892 


4678 


6464 


784CIP2B_795


8016 


1107 


2893 


4679 


6465 


784CIP2B_796 


8017 


1108 


2894


4680 


6466 


784CIP2B_797 


8019 


1109 


2895


4681 


6467 


784CIP2B_798


8020 


1110 


2896 


4682


6468


784CIP2B_799


8022 


1111 


2897 


4683 


6469 


784CIP2B_800


8022


1112 


2898 


4684 


6470 


784CIP2B_801


8028 


1113 


2899 


4685 


6471 


784CIP2B_802


8030 
SEQ ID NO:
of full-
length
nucleotide
sequence


SEQ ID
NO: of
full-
length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID
NO:
of contig
peptide
sequence


Priority
docket number_
corresponding
SEQ ID NO: in
priority
application


SEQ ID
NO: in
U.S.S.N.
09/488,725


1114 


2900 


4686 


6472 


784CIP2B_803


8038 


1115


2901 


4687 


6473 


784CIP2B_804


anno 


1116


2902 


4688 


6474 


784CIP2B_805


R f>4 c: 

OU*f J 


1117 


2903 


4689 


6475 


784CIP2B_806




1118 


2904 


4690 


6476 


784CIP2B_807


flf>4^ 


1119 


2905 


4691 


6477 


784CIP2B_808


8047 


1120 


2906 


4692 


6478 


784CIP2B_809 


8051 


1121 


2907 


4693 


6479 


784CIP2B_810


8059 


1122 


2908 


4694 


6480 


784CIP2B_811


6064 


1123 


2909 


4695 


6481 


784CIP2B_812 


8069 


1124 


2910 


4696 


6482 


784CIP2B_813


8074 


1125 


2911 


4697 


6483


784CIP2B_814


8077 


1126


2912 


4698 


6484 


784CIP2B_815


8078 


1127 


2913 


4699 


6485


784CIP2B_816


8079 


1128 


2914 


4700 


6486


784CIP2B_817


8084


1129 


2915 


4701 


6487 


784CIP2B_818


8088


1130 


2916 


4702


6488


784CIP2B_819


8090


1131 


2917 


4703 


6489


784CIP2B_820


8091 


1132 


2918 


4704


6490


784CIP2B_821


8099 


1133 


2919


4705


6491


784CIP2B_822


8099 


1134 


2920 


4706 




784CIP2B_823


8100 


1135


2921 






784CIP2B_824


8102 


1136 


2922






784CIP2B_825


8103 


1137


2923


4709




784CIP2B_826


8103 


1138




4710






8104 


1139 


2925


4711


6497


784CIP2B_828


8108


1140 


2926 

4712




784CIP2B_829


8110 


1141 


2927 


4713




784CIP2B_830


8116 


1142 


2928


4714


6500


784CIP2B_831


8117 


1143 


2929


4715


6501


784CIP2B_832


8123 


1144 


2930


4716




784CIP2B_833 


8130 


1145 


2931 


4717




784CIP2B_834 


8130 


1146 


2932 


4718 




784CIP2B_835


8143 


1147


2933


4719


6505


784CIP2B_836 


8143 


1148 


2934 


4720


6506


784CIP2B_837 


8154 


1149 


2935 


4721


6507


784CIP2B_838 


8155 


1150 


2936 


4722


6508


784CIP2B_839 


8162 


1151 


2937


4723


6509


784CIP2B_840 


8163 


1152


2938


4724


6510


784CIP2B_841 


8172 


1153 


2939


4725 


6511 


784CIP2B_842


8173 


1154 


2940 


4726 


6512 


784CIP2B_843 


8179 


1155 


2941


4727


6513


784CIP2B_844


8182 


1156 


2942 


4728 


6514


784CIP2B_845 


8183 


1157 


2943 


4729 


6515 


784CIP2B_846


8184 


1158


2944 


4730 


6516 


784CIP2B_847 


8185 


1159 


2945 


4731 


6517 


784CIP2B_848


8187 


1160


2946 


4732 


6518 


784CIP2B_849 


8188 


1161 


2947


4733 


6519 


784CIP2B_850


8190 


1162 


2948


4734 


6520 


784CIP2B_851 


8190 


1163 


2949 


4735 


6521 


784CIP2B_852 


8192 


1164 


2950 


4736 


6522


784CIP2B_853


8193 


1165 


2951 


4737 


6523 


784CIP2B_854 


8197 


1166 


2952 


4738 


6524 


784CIP2B_855 


8197


1167 


2953


4739 


6525 


784CIP2B_856 


8199 


1168 


2954 


4740 


6526 


784CIP2B_857


8202 


1169 


2955 


4741 


6527 


784CIP2B_858 


8203 


1170 


2956 


4742 


6528 


784CIP2B_859 


8208 


1171 


2957 


4743 


6529


784CIP2B_860 


8209 


1172 


2958 


4744


6530 


784CIP2B_861 


8211 


1173 


2959 


4745 


6531 


784CIP2B_862 


8214 


1174 


2960 


4746 


6532 


784CIP2B_863 


8217 


1175


2961


4747 


6533 


784CIP2B_864


8223 






WO 01/53312 



PCT/US00/34263



SEQ ID NO:
of full-
length
nucleotide
sequence


SEQ ID
NO: of
full-
length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID
NO:
of contig
peptide
sequence


Priority
docket number_
corresponding
SEQ ID NO: in
priority
application


SEQ ID
NO: in
U.S.S.N.
09/488,725


1176 


2962


4748 


6534 


784CIP2B_865 


8224


1177 


2963 


4749 


6535


784CIP2B_866


8226


1178 


2964 


4750


6536


784CIP2B_867


8227 


1179 


2965 


4751 


6537 


784CIP2B_868 


OZ 2. 


1180 


2966 


4752 


6538 


784CIP2B_869


o j£ J A 


1181


2967


4753


6539


784CIP2B_870


8236 


1182 


2968 


4754 


6540 


784CIP2B_871


8239 


1183 


2969 


4755 


6541 


784CIP2B_872



8244 


1184 


2970 


4756 


6542 


784CIP2B_873 


8245 


1185 


2971 


4757 


6543 


784CIP2B_874 


8248 


1186 


2972 


4758 


6544 


784CIP2B_875


825jl. 


1187 


2973 


4759 


6545 


784CIP2B_876 


8253 


1188 


2974 


4760 


6546 


784CIP2B_877


8260 


1189 


2975 


4761 


6547 


784CIP2B_878 


8262 


1190 


2976 


4762 


6548 


784CIP2B_879 


8268 


1191 


2977 


4763 


6549 


784CIP2B_880 


8270 


1192 


2978 


4764 


6550 


784CIP2B_881


8272 


1193 


2979 


4765 


6551 


784CIP2B_882 


8274 


1194 


2980 


4766 


6552


784CIP2B_883 


8274 


1195 


2981 


4767 


6553 


784CIP2B_884 


8275 


1196 


2982 


4768 


6554 


784CIP2B_885 


8277 


1197 


2983 


4769 


6555 


784CIP2B_886


8281 


1198 


2984 


4770 


6556 


784CIP2B_887 


8283 


1199 


2985 


4771 


6557 


784CIP2B_888 


8289 


1200 


2986 


4772 


6558 


784CIP2B_889 


8295 


1201 


2987 


4773 


6559 


784CIP2B_890 


8300 


1202 


2988 


4774 


6560 


784CIP2B_891 


8303 


1203 


2989 


4775 


6561 


784CIP2B_892


8304 


1204 


2990 


4776 


6562 


784CIP2B_893 


8305 


1205 


2991 


4777 


6563 


784CIP2B_894 


8309 


1206 


2992 


4778 


6564 


784CIP2B_895 


8318 


1207 


2993 


4779 


6565 


784CIP2B_896


8319 


1208 


2994 


4780 


6566 


784CIP2B_897 


8321 


1209 


2995 


4781 


6567 


784CIP2B_898 


8322 


1210 


2996 


4782 


6568 


784CIP2B_899


8323 


1211 


2997 


4783 


6569


784CIP2B_900 


8325 


1212 


2998 


4784 


6570 


784CIP2B_901


8331 


1213


2999 


4785 


6571 


784CIP2B_902 


8332 


1214 


3000 


4786 


6572 


784CIP2B_903 


8333 


1215 


3001 


4787 


6573 


784CIP2B_904 


8335


1216 


3002 


4788 


6574 


784CIP2B_905


8336 


1217 


3003 


4789 


6575 


784CIP2B_906 


8337 


1218 


3004 


4790 


6576 


784CIP2B_907


8340 


1219 


3005 


4791 


6577 


784CIP2B_908


8343 


1220 


3006 


4792 


6578 


784CIP2B_909


8347 


1221 


3007 


4793 


6579 


784CIP2B_910 


8349 


1222 


3008 


4794 


6580 


784CIP2B_911 


8351 


1223 


3009 


4795 


6581 


784CIP2B_912 


o i c "a 


1224 


3010 


4796 


6582 


784CIP2B_913 


8355 



1225 


3011 


4797 


6583 


784CIP2B_914


8361 


1226 


3012 


4798 


6584 


784CIP2B_915 


8365


1227 


3013 


4799 


6585 


784CIP2B_916


8367 


1228


3014 


4800 


6586 




8369


1229 


3015 


4801


6587 


784CIP2B_919 


8375 


1230 


3016 


4802 


6588 


784CIP2B_920 


8387 


1231 


3017 


4803 


6589 


784CIP2B_921 


8391 


1232 


3018 


4804 


6590 


784CIP2B_922


8393 


1233 


3019 


4805 


6591 


784CIP2B_923 


8393 


1234 


3020 


4806 


6592 


784CIP2B_924 


8394 


1235 


3021 


4807


6593


784CIP2B_925


8395 


1236


3022


4808 


6594 


784CIP2B_926


8396 


1237 


3023 


4809 


6595 


784CIP2B_927


8398 
SEQ ID NO:
of full-
length
nucleotide
sequence


SEQ ID
NO: of
full-
length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID
NO:
of contig
peptide
sequence


Priority
docket number_
corresponding
SEQ ID NO: in
priority
application


SEQ ID
NO: in
U.S.S.N.
09/488,725


1238 






6596


784CIP2B_928


8402 


1239


3025


4811


6597


784CIP2B_929


8402


1240 


3026 


4812


6598


784CIP2B_930


8405


1241 


3027 


4813


6599


784CIP2B_931


6406


1242 


3028


4814


6600


784CIP2B_932


8409 


1243 


3029


4815


6601


784CIP2B_933


8410


1244 


3030 


4816


6602


784CIP2B_934


8414


1245 


3031


4817


6603


784CIP2B_935


8415


1246 


3032


4818


6604


784CIP2B_936


8419


1247


3033


4819


6605


784CIP2B_937


8426






4820


6606


784CIP2B_938


8430 


1249


3035


4821


6607


784CIP2B_939


8431




3036


4822


6608


784CIP2B_940


8432 




3037


4823


6609


784CIP2B_941


8433


1252


3038


4824


6610


784CIP2B_942


8434




3039


4825


6611


784CIP2B_943


8438




3040 


4826 


6612 


784CIP2B_944


8439


1255


3041


4827


6613


784CIP2B_945


8441


1256


3042


4828


6614


784CIP2B_946


8450


1257


3043


4829


6615


784CIP2B_947


8451


1258


3044


4830


6616


784CIP2B_948


8452 


1259


3045


4831


6617


784CIP2B_949


8460


1260 


3046 


4832


6618


784CIP2B_950


8461



1261 


3047


4833


6619


784CIP2B_951


8462


1262 


3048 


4834 


6620 


784CIP2B_952


8464


1263 


3049 


4835 


6621 


784CIP2B_953


8465


1264 


3050 


4836 


6622 


784CIP2B_954


8467



1265 


3051 


4837


6623


784CIP2B_955


8470 


1266 


3052 


4838 


6624 


784CIP2B_956 


8471 


1267 


3053 


4839 


6625


784CIP2B_957 


8473 


1268


3054 


4840 


6626 


784CIP2B_958 


8474


1269 


3055 


4841


6627


784CIP2B_959


8475 


1270 



3056 


4842 


6628 


784CIP2B_960


8476 


1271 


3057 


4843 


6629


784CIP2B_961


8480


1272


3058


4844


6630


784CIP2B_962


8482 


1273


3059


4845


6631


784CIP2B_963


8482


1274 


3060


4846


6632


784CIP2B_964


8486


1275


3061


4847


6633


784CIP2B_965


8408


1276


3062


4848


6634


784CIP2B_966


8492


1277


3063


4849


6635


784CIP2B_967


8494


1278


3064


4850


6636


784CIP2B_968


8496


1279


3065


4851


6637


784CIP2B_969


8497




3066


4852


6638


784CIP2B_970


8499


1281


3067


4853


6639


784CIP2B_971


8513


1282


3068


4854


6640


784CIP2B_972


8522


1283


3069


4855


6641


784CIP2B_973


8526


1284


3070


4856


6642


784CIP2B_974


8531


1285


3071


4857


6643


784CIP2B_975


8533


1286


3072


4858


6644


784CIP2B_976 


8542 


1287


3073


4859


6645


784CIP2B_977


8544


1288


3074


4860


6646


784CIP2B_978


8565


1289


3075


4861


6647


784CIP2B_979


8565


1290 




4862


6648


784CIP2B_980

ob / 2 


1291 


3077 


4863


6649


784CIP2B_981


8576


1292 


3078 


4864 


6650


784CIP2B_982 


8578 


1293 


3079 


4865 


6651


784CIP2B_983


8584


1294 


3080


4866


6652


784CIP2B_984


8598


1295 


3081 


4867 


6653


784CIP2B_985


8602


1296 


3082 


4868 


6654


784CIP2B_986


8604


1297 


3083 


4869 


6655


784CIP2B_987


8609


1298 


3084 


4870 


6656


784CIP2B_988


8612


1299 


3085 


4871 


6657


784CIP2B_989


8637
SEQ ID NO:
of full-
length
nucleotide
sequence


SEQ ID
NO: of
full-
length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID
NO:
of contig
peptide
sequence


Priority
docket number_
corresponding
SEQ ID NO: in
priority
application


SEQ ID
NO: in
U.S.S.N.
09/488,725


1300 


3086 


4872 



6658 






1301 


3087 


4873


6659


784CIP2B_991


O u *: J 


1302 


3088 


4874


6660




8645


1303 


3089 


4875 


6661




8650 


1304 


3090


4876


6662


784CIP2B_994


8651 


1305 


3091 


4877 


6663 


784CIP2B_995


8654 


1306 


3092 


4878 


6664 




8655


1307 


3093


4879


6665


784CIP2B_997


8657 


1308 


3094 


4880 



6666 






1309 


3095 


4881 


6667 


784CIP2B_999


o o w o 


1310 


3096 


4882


6668


784CIP2B_1000


OCT! 


1311 


3097 


4883 


6669 


784CIP2B_1001


ft K"77 


1312 


3098 


4884 


6670 


784CIP2B_1002 


8692 


1313 


3099 


4885 


6671 


784CIP2B_1003


87C6 


1314 


3100 


4886 


6672 


784CIP2B_1004


8716 


1315 


3101 


4887 


6673 


784CIP2B_1005


8719 


1316 


3102 


4888 


6674 


784CIP2B_1006 


8743 


1317 


3103 


4889 


6675 


784CIP2B_1007


8764 


1318


3104 


4890 


6676 


784CIP2B_1008


8764 


1319 


3105 


4891


6677


784CIP2B_1009


8764 


1320 


3106 


4892 


6678 


784CIP2B_1010


8774 


1321 


3107 


4893


6679 


784CIP2B_1011


8782


1322 


3108 


4894 


6680 


784CIP2B_1012


8 i 96 


1323 


3109 


4895 


6681 


784CIP2B_1013




1324 


3110 


4896 


6682 




8842 


1325 


3111 


4897 


6683 


784CIP2B_1015 


8842 


1326 


3112 


4898 


6684 


784CIP2B_1016 


8858 


1327 


3113 


4899 


6685 


784CIP2B_1017 


8871


1328 


3114 


4900 


6686 


784CIP2B_1018 


8921 


1329 


3115 


4901 


6687 


784CIP2B_1019 


8927 


1330 


3116 


4902 


6688 


784CIP2B_1020 


8942 


1331 


3117 


4903 


6689 


784CIP2B_1021 


8994 


1332 


3118 


4904 


6690 


784CIP2B_1022


9023 


1333 


3119 


4905 


6691 


784CIP2B_1023 


9028 


1334 


3120 


4906 


6692 


784CIP2B_1024 


9058 


1335 


3121


4907 


6693 


784CIP2B_1025 


9058 


1336 


3122 


4908 


6694 


784CIP2B_1026


9079 


1337 


3123 


4909 


6695 


784CIP2B_1027 


9079 


1338 


3124 


4910 


6696 


784CIP2B_1028


9082 


1339 


3125 


4911 


6697 


784CIP2B_1029 


9084 


1340 


3126 


4912 


6698 


784CIP2B_1030 


9093 


1341 


3127 


4913 


6699 


784CIP2B_1031 


9101 


1342 


3128


4914


6700


784CIP2B_1032 


9103 


1343 


3129


4915


6701


784CIP2B_1033 


9105 


1344 


3130 


4916 


6702


784CIP2B_1034 


9151 


1345 


3131


4917


6703


784CIP2B_1035


9161 


1346 


3132 


4918


6704


784CIP2B_1036 


5172 


1347 


3133 


4919 


6705


784CIP2B_1037 


9174 


1348 


3134 


4920 


6706


784CIP2B_1038 


9204 


1349 


3135 


4921 



6707 


784CIP2B_1039 


9234 


1350 


3136 


4922 



6708 


784CIP2B_1040 


9235 


1351 


3137 


4923 


6709 


784CIP2B_1041 


9239 










784CIP2B_1042 


9256 


1353 


3139


4925 


6711 


784CIP2B_1043 


9276 


1354 


3140 


4926 


6712 


784CIP2B_1044 


9345 


1355 


3141 


4927


6713 


784CIP2B_1045 


9379 


1356 


3142 


4928 


6714 


784CIP2B_1046 


9435 


1357 


3143 


4929 


6715 


784CIP2B_1047 


9437 


1358 


3144 


4930 


6716 


784CIP2B_1048


9469 


1359 


3145 


4931 


6717 


784CIP2B_1049


9500 


1360 


3146 


4932


6718


784CIP2B_1050


9502 


1361 


3147 


4933 


6719 


784CIP2B_1051


9520 






SEQ ID NO:
of full-
length
nucleotide
sequence


SEQ ID
NO: of
full-
length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID
NO:
of contig
peptide
sequence


Priority
docket number_
corresponding
SEQ ID NO: in
priority
application


SEQ ID
NO: in
U.S.S.N.
09/488,725


1362 


3148


4934


6720


784CIP2B_1052


O^A 1 
3341 


1363 


3149


4935


6721


784CIP2B_1053


-»3H1 


1364 


3150 


4936 


6722 


784CIP2B_1054


qc&a 


1365 


3151 


4937 


6723 


784CIP2B_1055




1366 


3152 


4938 


6724 


784CIP2B_1056


occr 


1367 


3153 


4939 


6725 


784CIP2B_1057




1368


3154 


4940 


6726 


784CIP2B_1058


i ocaq 

7J07 


1369 


3155 


4941 


6727 


784CIP2B_1059


QCQQ 
J J _7 _7 


1370 


3156 


4942


6728


784CIP2B_1060


O V/ « 


1371 


3157 


4943 


6729 


784CIP2B_1061


q n ^ 


1372 


3158 


4944 


6730 


784CIP2B_1062


70£6 


1373 


3159 


4945 


6731 




^ 0 J. 5 


1374 


3160 


4946


6732


784CIP2B_1064


• QCiC 


1375 


3161


4947


6733


784CIP2B_1065


17 / 4 / 


1376 


3162


4948


6734


784CIP2B_1066


y / / j 


1377


3163


4949


6735


784CIP2B_1067


y /oi> 


1378 


3164






784CIP2B_1068


9801


1379 


3165


4951


6737


784CIP2B_1069


9811 


1380


3166


4952


6738


784CIP2B_1070


9843 


1381


3167


4953


6739


784CIP2B_1071


9854 


1382


3168


4954


6740


784CIP2B_1072


9854


1383


3169


4955


6741


784CIP2B_1073


9864 


1384


3170


4956


6742


784CIP2B_1074


3864 


1385


3171


4957


6743


784CIP2B_1075


9871 


1386


3172


4958


6744


784CIP2B_1076


9879


1387


3173


4959


6745


784CIP2B_1077


9881




3174


4960


6746


784CIP2B_1078


9885


1389


3175


4961


6747


784CIP2B_1079


9301


1390


3176


4962


6748


784CIP2B_1080






3177


4963


6749


784CIP2B_1081


9916




3178


4964


6750


784CIP2B_1082


99 Z 1 






4965


6751


784CIP2B_1083


9925 


1394


3180


4966


6752


784CIP2B_1084


yyj u 




3181


4967


6753


784CIP2B_1085


9949


1396


3182


4968


6754


784CIP2B_1086


99 bl 


1397


3183


4969


6755


784CIP2B_1087


9359


1398


3184


4970


6756


784CIP2B_1088


q a -7 -j 


1399 


3185


4971


6757


784CIP2B_1089


y y 0 a 


1400


3186


4972


6758


784CIP2B_1090


QQ Q A 

yy y u 


1401


3187


4973


6759


784CIP2B_1091


1 nn7i 

lu U AX 


1402 


3188 


4974


6760




X v> U*s X 


1403


3189


4975


6761




100fi7 

IvUD / 


1404 


3190 


4976 


6762


784CIP2B_1095


10073


1405


3191


4977


6763


784CIP2B_1096


1 0ri2 


1406


3192


4978


6764


784CIP2B_1097


10117


1407 


3193 


4979 


6765


784CIP2B_1098


10132 


1408 


3194 


4980


6766


784CIP2B_1099


1016^9 ! 


1409


3195


4981


6767


784CIP2B_1100


10217 


1410 


3196 


4982 


6768


784CIP2B_1101


10226


1411 


3197 


4983 


6769


784CIP2B_1102


10232 


1412


3198


4984


6770


784CIP2B_1103


10237 


1413


3199


4985


6771


784CIP2B_1104


1 u a / y 


1414 


3200 


4986 


6772


784CIP2C_1


33 


1415 


3201 


4987 


6773


784CIP2C_2 


271 


1416 


3202 


4988 


6774


784CIP2C_3


848 


1417 


3203 


4989


6775


784CIP2C_4


849 


1418 


3204 


4990 


6776


784CIP2C_5 


864 


1419 


3205 


4991 


6777


784CIP2C_6


953 


1420 


3206 


4992 


6778


784CIP2C_7 


980 


1421 


3207 


4993 


6779


784CIP2C_8


1595 


1422 


3208 


4994 


6780


784CIP2C_9 


1697 


1423 


3209 


4995 


6781


784CIP2C_10 


1744 
SEQ ID NO:
of full-
length
nucleotide
sequence


SEQ ID
NO: of
full-
length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID
NO:
of contig
peptide
sequence


Priority
docket number_
corresponding
SEQ ID NO: in
priority
application


SEQ ID
NO: in
U.S.S.N.
09/488,725


1424


3210 


4996 


6782




1937 


1425 


3211 


4997


6783


784CIP2C_12


1955 


1426 


3212 


4998 


6784




1955 


1427 


3213 


4999 


6785




2185 


1428 


3214 


5000


6786


784CIP2C_15


2889 


1429 


3215 


5001


6787


784CIP2C_16


2901 


1430 


3216 


5002


6788




2902 


1431 


3217 


5003 


6789


784CIP2C_18


2905 


1432 


3218 


5004


6790


784CIP2C_19


2946 


1433 


3219 


5005


6791


784CIP2C_20


2956 


1434 


3220 


5006


6792




2959 


1435


3221


5007


6793




2965 


1436 


3222 


5008 


6794




2966 


1437 


3223 


5009 


6795 


784CIP2C_24


2970 


1438 


3224 


5010 


6796






1439 


3225 


5011 



6797 




2987


1440 


3226 


5012 


6798 






1441 


3227 


5013 


6799 


784CIP2C_28




1442 


3228 


5014 


6800 


784CIP2C_29


JVl f 

■i A A C 


1443 


3229 


5015 


6801 


784CIP2C_30


•a Kt^n 


1444 


3230 


5016 


6802 


784CIP2C_31




1445 


3231 


5017 


6803 


784CIP2C_32




1446 


3232 


5018 


6804 


784CIP2C_33




1447 


3233 


5019 


6805 


784CIP2C_34


Til *31 7 1 
J ** J X 

•5 V-a 0 


1448 


3234 


5020 


6806 


784CIP2C_35


j *» j 0 


1449 


3235 


5021 


6807 


784CIP2C_36




1450 


3236 


5022 


6808 


784CIP2*-_33 




1451 


3237 


5023 


6809 


784CIP2C_40


J *± D D 


1452 


3238 


5024 


6810 


784CIP2C_41


J> •« 0 0 


1453 


3239


5025


6811


784CIP2C_42




1454 


3240 


5026 


6812




746 8 
J tt a u> 


1455 


3241 


5027 


6813 


784CIP2C_44




1456 


3242 


5028 


6814 




14H4 


1457 


3243 


5029 


6815 




"14 ft 8 


1458 


3244


5030


6816


784CIP2C_47


J ^ ^ JV 


1459 


3245 


5031 


6817 


784CIP2C_48




1460 


3246


5032


6818


784CIP2C_49


"*4 94 


1461 


3247 


5033 


6819 


784CIP2C_50


3495 


1462


3248


5034


6820


784CIP2C_51


3496 


1463 


3249 


5035 


6821




3503 


1464 


3250


5036


6822




3503 


1465 


3251 


5037


6823


784CIP2C_54


3504 


1466 


3252 


5038 


6824


784CIP2C_55


3511 


1467 


3253 


5039 


6825




3531 


1468 


3254 


5040


6826




3536 


1469


3255


5041


6827




3546 


1470 


3256 

5042


6828


784CIP2C_59


3548 


1471 


3257 


5043


6829


784CIP2C_60


3551 


1472 


3258 


5044


6830


784CIP2C_61


3553 


1473 


3259 


5045


6831


784CIP2C_62


3564 


1474 


3260 


5046


6832


784CIP2C_63


3 567 


1475


3261


5047


6833


784CIP2C_64


3572


1476


3262


5048


6834 


784CIP2C_65 


3573 


1477 


3263 


5049 


6835 


784CIP2C_66


3574


1478 


3264 


5050 


6836 


784CIP2C_67


3583 


1479 


3265 


5051 


6837 


784CIP2C_68 


3615


1480 


3266 


5052 


6838 


784CIP2C_69 


3623 


1481 


3267 


5053 


6839 


784CIP2C_70


3629


1482 


3268 


5054 


6840


784CIP2C_71


3666 


1483 


3269 


5055 


6841 


784CIP2C_72 


3667 


1484 


3270 


5056 


6842


784CIP2C_73 


3906 


1485 


3271 


5057 


6843 


784CIP2C_74


3912
SEQ ID NO:
of full-
length
nucleotide
sequence


SEQ ID
NO: of
full-
length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID
NO:
of contig
peptide
sequence


Priority
docket number_
corresponding
SEQ ID NO: in
priority
application


SEQ ID
NO: in
U.S.S.N.
09/488,725


1486 


3272 


5058 


6844 


784CIP2C_75 


3924 


1487 


3273 


5059 


6845 


784CIP2C_76 


3928 


1488


3274 


5060 


6846 


784CIP2C_77


3935 


1489 


3275 


5061 


6847 


784CIP2C_78 


3959 


1490 


3276 


5062 


6848 


784CIP2C_79 


3981 


1491 


3277 


5063 


6849 


784CIP2C_80 


3989 


1492 


3278 


5064 


6850 


784CIP2C_81


4295 


1493 


3279 


5065 


6851 


784CIP2C_82


4300 


1494 


3280 


5066 


6852 


784CIP2C_83 


4360 


1495 


3281 


5067 


6853 


784CIP2C_84


4362 


1496 


3282 


5068 


6854 


784CIP2C_85 


4371 


1497


3283 


5069 


6855 


784CIP2C_86 


4373 


1498 


3284 


5070 


6856 


784CIP2C_87 


4376 


1499 


3285 


5071 


6857 


784CIP2C_89 


4378 


1500 


3286 


5072 


6858 


784CIP2C_90 


4382 


1501 


3287 


5073 


6859 


784CIP2C_91 


4409 


1502


3288


5074 


6860 


784CIP2C_92 


4421 


1503 


3289 


5075 


6861 


784CIP2C_93 


4421 


1504 


3290 


5076 


6862 


784CIP2C_94 


4426 


1505 


3291 


5077 


6863 


784CIP2C_95 


4430 


1506 


3292 


5078 


6864 


784CIP2C_96 


4435 


1507 


3293 


5079 


6865 


784CIP2C_97 


4436 


1508 


3294 


5080 


6866 


784CIP2C_98 


4439 


1509 


3295 


5081 


6867 


784CIP2C_99 


4440 


1510 


3296 


5082 


6868 


784CIP2C_100 


4441 


1511 


3297 


5083 


6869 


784CIP2C_101 


4442 


1512 


3298 


5084 


6870 


784CIP2C_102 


4455 


1513 


3299 


5085 


6871


784CIP2C_103 


4462 


1514


3300 


5086 


6872 


784CIP2C_104


4466 


1515 


3301 


5087 


6873 


784CIP2C_105 


4469 


1516 


3302 


5088 


6874


784CIP2C_106 


4477 


1517 


3303 


5089 


6875


784CIP2C_107


4481 


1518 


3304 


5090 


6876


784CIP2C_108 


4483 


1519 


3305 


5091 


6877 


784CIP2C_109 


4484 


1520 


3306 


5092 


6878 


784CIP2C_110 


4486 


1521 


3307 


5093 


6879


784CIP2C_111


4490 


1522 


3308 


5094 


6880 


784CIP2C_112 


4499 


1523 


3309 


5095 


6881


784CIP2C_113 


4503 


1524 


3310 


5096 


6882 


784CIP2C_114 


4506 


1525 


3311 


5097 


6883 


784CIP2C_115 


4509 


1526 


3312 


5098 


6884 


784CIP2C_116 


4514 


1527 


3313 


5099 


6885 


784CIP2C_117 


4516 


1528 


3314 


5100 


6886 


784CIP2C_118 


4522 


1529 


3315 


5101


6887 


784CIP2C_119 


4525 


1530 


3316 


5102 


6888 


784CIP2C_120


4527 


1531 


3317 


5103 


6889 


784CIP2C_121 


4528 


1532 


3318 


5104 


6890 


784CIP2C_122 


4529 


1533 


3319 


5105 


6891 


784CIP2C_123


4532 


1534 


3320 


5106 


6892 


784CIP2C_124 


4537 


1535 


3321 


5107 


6893 


784CIP2C_125 


4538


1536 


3322 


5108 


6894 


784CIP2C_126 


4551 


1537 


3323 


5109 


6895 


784CIP2C_127 


4552 


1538 


3324 


5110 


6896 


784CIP2C_128


4559 


1539 


3325 


5111 


6897 


784CIP2C_129 


4567 


1540 


3326 


5112 


6898 


784CIP2C_130


4568 


1541 


3327 


5113 


6899 


784CIP2C_132


4585 


1542 


3328 


5114 


6900 


784CIP2C_133


4592 


1543 


3329 


5115 


6901 


784CIP2C_134 


4609 


1544 


3330 


5116 


6902 


784CIP2C_135 


4616 


1545


3331 


5117 


6903 


784CIP2C_136 


4617 


1546 


3332 


5118 


6904 


784CIP2C_137


4618 


1547 


3333 


5119


6905 


784CIP2C_138


4620


SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725


1548 


3334 


5120 


6906 


784CIP2C_139 


4624 


1549 


3335 


5121 


6907 


784CIP2C_140


4632 


1550


3336 


5122 


6908 


784CIP2C_141 


4634 


1551 


3337 


5123 


6909 


784CIP2C_142 


4638 


1552 


3338 


5124 


6910 


784CIP2C_143 


4639 


1553 


3339 


5125 


6911 


784CIP2C_144 


4643 


1554 


3340 


5126 


6912 


784CIP2C_145


4644 


1555


3341 


5127


6913 


784CIP2C_146


4655 


1556 


3342 


5128 


6914 


784CIP2C_147 


4668 


1557 


3343 


5129 


6915 


784CIP2C_148 


4677 


1558 


3344 


5130 


6916 


784CIP2C_149 


4677 


1559 


3345 


5131


6917 


784CIP2C_150 


4677 


1560 


3346 


5132 


6918 


784CIP2C_152 


4682 


1561 


3347 


5133 


6919 


784CIP2C_153


4690 


1562


3348 


5134 


6920 


784CIP2C_154


4691 


1563 


3349 


5135 


6921 


784CIP2C_155


4727 


1564 


3350 


5136 


6922 


784CIP2C_156 


4730 


1565 


3351 


5137 


6923 


784CIP2C_157 


4734 


1566 


3352 


5138 


6924 


784CIP2C_158


4757 


1567 


3353 


5139 


6925 


784CIP2C_159


4764 


1568 


3354 


5140 


6926 


784CIP2C_160 


4786 


1569


3355 


5141 


6927 


784CIP2C_161


4793 


1570 


3356 


5142 


6928 


784CIP2C_162


4825 


1571 


3357 


5143 


6929 


784CIP2C_163 


4826 


1572 


3358 


5144 


6930 


784CIP2C_164 


4850 


1573


3359


5145


6931


784CIP2C_165


4853 


1574 


3360 


5146 


6932 


784CIP2C_166 


4855 


1575 


3361 


5147 


6933 


784CIP2C_167


4856 


1576 


3362 


5148 


6934 


784CIP2C_168


4867 


1577 


3363 


5149 


6935 


784CIP2C_169


4869 


1578 


3364 


5150 


6936


784CIP2C_170


4878 


1579 


3365 


5151 


6937 


784CIP2C_171 


4880 


1580 


3366 


5152 


6938 


784CIP2C_172


4942 


1581 


3367 


5153 


6939 


784CIP2C_173 


4945 


1582 


3368 


5154 


6940 


784CIP2C_174 


4950


1583 


3369 


5155 


6941


784CIP2C_175


4952 


1584 


3370 


5156 


6942 


784CIP2C_176 


4954 


1585 


3371 


5157 


6943 


784CIP2C_177 


4958 


1586 


3372 


5158 


6944 


784CIP2C_178 


4961


1587 


3373 


5159 


6945 


784CIP2C_179 


5590 


1588


3374 


5160 


6946 


784CIP2C_180


5599 


1589 


3375 


5161 


6947 


784CIP2C_181


5692 


1590 


3376 


5162 


6948 


784CIP2C_182 


5732 


1591 


3377 


5163 


6949 


784CIP2C_183


5765 


1592 


3378 


5164 


6950 


784CIP2C_184 


5771 


1593 


3379 


5165 


6951 


784CIP2C_185


5774 


1594 


3380 


5166 


6952 


784CIP2C_186


5793 


1595 


3381 


5167 


6953 


784CIP2C_187


5806 


1596 


3382 


5168 


6954 


784CIP2C_188 


5852 


1597


3383 


5169 


6955 


784CIP2C_189


5892 


1598 


3384 


5170 


6956 


784CIP2C_190


6057 


1599


3385 


5171 


6957 


784CIP2C_191


6061 


1600


3386 


5172 


6958


784CIP2C_192




1601 


3387 


5173 


6959 


784CIP2C_193 


6160 


1602 


3388 


5174 


6960 


784CIP2C_194


6297 


1603 


3389 


5175 


6961 


784CIP2C_195


6398


1604 


3390 


5176 


6962 


784CIP2C_196


6398 


1605 


3391 


5177 


6963 


784CIP2C_197 


6415 


1606 


3392 


5178 


6964 


784CIP2C_198 


6448 


1607 


3393 


5179 


6965


784CIP2C_199


6469 


1608 


3394 


5180 


6966 


784CIP2C_200


6476 


1609 


3395 


5181 


6967 


784CIP2C_201


6561


1610 


3396 


5182 


6968 


784CIP2C_202


6574 


1611 


3397 


5183


6969 


784CIP2C_203 


6578 


1612 


3398 


5184 


6970 


784CIP2C_204


6662 


1613 


3399 


5185 


6971 


784CIP2C_205 


6672 


1614 


3400 


5186 


6972 


784CIP2C_206


6691 


1615 


3401 


5187 


6973 


784CIP2C_207 


6695 


1616 


3402 


5188 


6974 


784CIP2C_208 


6746 


1617 


3403 


5189 


6975 


784CIP2C_209 


6898 


1618 


3404 


5190 


6976 


784CIP2C_210


6938 


1619 


3405 


5191 


6977 


784CIP2C_211 


6943 


1620 


3406 


5192 


6978


784CIP2C_212


7110 


1621 


3407 


5193 


6979 


784CIP2C_213 


7200 


1622 


3408 


5194 


6980 


784CIP2C_214


7212 


1623 


3409 


5195 


6981 


784CIP2C_215


7218 


1624 


3410 


5196 


6982 


784CIP2C_216 


7249 


1625 


3411 


5197 


6983 


784CIP2C_217


7500 


1626 


3412 


5198 


6984 


784CIP2C_218 


7509 


1627 


3413 


5199 


6985 


784CIP2C_219 


7523 


1628


3414 


5200 


6986 


784CIP2C_220 


7544 


1629 


3415 


5201 


6987 


784CIP2C_221 


7564 


1630 


3416 


5202 


6988 


784CIP2C_222 


7568 


1631 


3417 


5203 


6989 


784CIP2C_223 


7631 


1632 


3418 


5204 


6990 


784CIP2C_224 


7813 


1633 


3419 


5205 


6991 


784CIP2C_225


7831 


1634 


3420 


5206 


6992 


784CIP2C_226


7843 


1635 


3421 


5207 


6993 


784CIP2C_227 


7907 


1636 


3422 


5208 


6994 


784CIP2C_228


7943 


1637 


3423 


5209 


6995 


784CIP2C_229 


8175 


1638 


3424 


5210 


6996 


784CIP2C_230 


8216 


1639 


3425 


5211 


6997 


784CIP2C_231 


8225 


1640 


3426 


5212 


6998 


784CIP2C_232 


8271 


1641 


3427 


5213 


6999 


784CIP2C_233


8397 


1642 


3428 


5214


7000 


784CIP2C_234 


8466 


1643 


3429 


5215 


7001 


784CIP2C_235


8503 


1644 


3430 


5216 


7002 


784CIP2C_236 


8953


1645 


3431 


5217 


7003 


784CIP2C_237 


9106 


1646 


3432 


5218 


7004 


784CIP2C_238 


9139 


1647 


3433 


5219 


7005 


784CIP2C_239 


9555 


1648 


3434 


5220 


7006 


784CIP2C_240 


9650 


1649 


3435 


5221 


7007 


784CIP2C_241 


9889 


1650 


3436 


5222 


7008 


784CIP2C_242


9933 


1651 


3437 


5223 


7009 


784CIP2C_243 


9953 


1652 


3438 


5224 


7010 


784CIP2C_244 


9981 


1653 


3439 


5225 


7011 


784CIP2D_1


746


1654 


3440 


5226 


7012 


784CIP2D_2


3558 


1655 


3441 


5227 


7013 


784CIP2D_3 


3553 


1656 


3442 


5228 


7014 


784CIP2D_4 


3633 


1657 


3443 


5229 


7015


784CIP2D_5


3658 


1658 


3444 


5230 


7016 


784CIP2D_6 


3732 


1659 


3445 


5231 


7017 


784CIP2D_7


4004 


1660 


3446 


5232 


7018 


784CIP2D_8 


4700 


1661 


3447 


5233 


7019 


784CIP2D_9


4703 


1662 


3448 


5234 


7020 


784CIP2D_10


4774


1663 


3449 


5235 


7021 


784CIP2D_11 


4894


1664


3450 


5236 


7022 


784CIP2D_12


4918 


1665 


3451 


5237 


7023 


784CIP2D_13


5159 


1666 


3452 


5238 


7024 


784CIP2D_14


7443 


1667 


3453 


5239 


7025 


784CIP2D_15


8673 


1668


3454 


5240 


7026 


784CIP2D_16 


8679 


1669 


3455 


5241 


7027 


784CIP2D_17


8727 


1670 


3456 


5242 


7028 


784CIP2D_18 


8734 


1671 


3457 


5243 


7029 


784CIP2D_19


8756 



WO 01/53312



PCT/US00/34263 





1672 


3458


5244 


7030 


784CIP2D_20


8618 


1673 


3459 


5245 


7031 


784CIP2D_21 


8644 


1674 


3460 


5246 


7032 


784CIP2D_22 


8846 


1675 


3461 


5247


7033 


784CIP2D_23


8912 


1676 


3462 


5248 


7034 


784CIP2D_24


8916 


1677 


3463 


5249 


7035 


784CIP2D_25 


8918 


1678 


3464 


5250 


7036 


784CIP2D_26 


8941 


1679 


3465 


5251 


7037 


784CIP2D_27 


8941 


1680 


3466 


5252 


7038 


784CIP2D_28 


8951 


1681 


3467 


5253 


7039 


784CIP2D_29 


8951 


1682 


3468 


5254 


7040 


784CIP2D_30


9007 


1683 


3469 


5255 


7041 


784CIP2D_31 


9012 


1684 


3470 


5256 


7042 


784CIP2D_32


9013 


1685 


3471 


5257 


7043 


784CIP2D_33 


9025 


1686


3472 


5258 


7044


784CIP2D_34 


9053 


1687 


3473 


5259 


7045 


784CIP2D_35 


9054 


1688 


3474 


5260 


7046 


784CIP2D_36


9054 


1689 


3475 


5261 


7047 


784CIP2D_37 


9113 


1690 


3476 


5262 


7048 


784CIP2D_38 


9134 


1691 


3477 


5263 


7049 


784CIP2D_39


9152 


1692


3478 


5264 


7050 


784CIP2D_40


9152 


1693 


3479 


5265 


7051 


784CIP2D_41 


9211 


1694 


3480 


5266 


7052 


784CIP2D_42 


9223 


1695 


3481 


5267 


7053 


784CIP2D_43 


9223 


1696


3482


5268 


7054 


784CIP2D_44 


9231 


1697 


3483 


5269 


7055 


784CIP2D_45 


9236 


1698 


3484 


5270 


7056 


784CIP2D_46 


9236 


1699 


3485 


5271 


7057 


784CIP2D_47 


9303 


1700 


3486 


5272 


7058 


784CIP2D_48


9309 


1701 


3487


5273 


7059 


784CIP2D_49


9314 


1702 


3488 


5274 


7060 


784CIP2D_50


9326 


1703 


3489 


5275 


7061 


784CIP2D_51


5335 


1704 


3490 


5276 


7062 


784CIP2D_52 


9348 


1705 


3491 


5277 


7063 


784CIP2D_53 


9376 


1706 


3492 


5278 


7064 


784CIP2D_54


9382 


1707 


3493 


5279 


7065 


784CIP2D_55 


9407


1708 


3494 


5280 


7066 


784CIP2D_56 


9414 


1709 


3495 


5281 


7067 


784CIP2D_57


9439 


1710 


3496 


5282 


7068 


784CIP2D_58 


5485 


1711 


3497 


5283 


7069 


784CIP2D_59 


9493


1712


3498 


5284 


7070 


784CIP2D_60 


9501 


1713 


3499 


5285 


7071 


784CIP2D_61


952£ 


1714 


3500 


5286 


7072 


784CIP2D_62 


9526 


1715


3501 


5287 


7073 


784CIP2D_63 


9551 


1716


3502 


5288 


7074 


784CIP2D_64


9557 


1717 


3503 


5289 


7075 


784CIP2D_65 


9568 


1718 


3504 


5290 


7076 


784CIP2D_66


9588 


1719 


3505 


5291 


7077 


784CIP2D_67


9597 


1720 


3506 


5292


7078 


784CIP2D_68


9615 


1721 


3507 


5293 


7079 


784CIP2D_69 


9628 


1722 


3508 


5294 


7080 


784CIP2D_70 


9649 


1723


3509 


5295 


7081 


784CIP2D_71


9652 


1724 


3510 


5296


7082 


784CIP2D_72 


9660 


1725 


3511 


5297 


7083 


784CIP2D_73 


9662 


1726 


3512 


5298 


7084 


784CIP2D_74


9725 


1727 


3513 


5299 


7085 


784CIP2D_75


9746 


1728 


3514 


5300 


7086 


784CIP2D_76


9777 


1729 


3515 


5301 


7087 


784CIP2D_77 


9787 


1730


3516 


5302


7088 


784CIP2D_78 


9790 


1731 


3517 


5303 


7089 


784CIP2D_79 


9842 


1732 


3518 


5304 


7090 


784CIP2D_80 


9842 


1733 


3519


5305


7091 


784CIP2D_81 


9848


1734 


3520


5306 


7092 




3ob r 


1735 


3521 


5307 


7093 


/ □ w.x rzi/ 03 


7 nni n 


1736 


3522


5308 


7094 




Xu LrX X 


1737 


3523 


5309 


7095 

J V w 




lUUb« 


1738 


3524 


5310 


7096 




XVSU3 r 


1739 


3525 


5311 


7097 


7fl4PTO^T> ft7 




1740 


3526 


5312


7098


7w/ir*T07r» oo 

> (aI/ 03 


7 r\ 7 "5 o 


1741 


3527 


5313 


7099


TftarTDon on 




1742 


3528 


5314 


7100 




7 m c c 
XUXo3 


1743 


3529 


5315 


7101 


/o^LiriU 33 


7 f%7 71 
1U1 /3 


1744 


3530 


5316 


7102




7 r\"i *7"> 
1U1 /3 


1745


3531


5317


7103


7 Q APTD7n OC 
'B4 V_X i* ^ U 33 


1 f>773 

1U2 /3 


1746 


3532


5318


7104


784CIP2E_1


3121


1747 


3533


5319


7105


784CIP2E_2


3628 


1748


3534


5320


7106


784CIP2E_4


3673


1749


3535


5321


7107


784CIP2E_3


4018 


1750 


3536


5322


7108


784CIP2E_6


4467


1751


3537


5323


7109


784CIP2E_7


4865


1752


3538


5324


7110


784CIP2E_8 


4916 


1753


3539


5325


7111


784CIP2E_9


4923


1754


3540


5326 


7112 


784CIP2E_10 


4926 


1755


3541


5327


7113 


784CIP2E_11 


4962 


1756


3542


5328


7114 


784CIP2E_12 


4963 


1757


3543


5329 


7115 


784CIP2E_13


4964


1758


3544


5330 


7116 


784CIP2E_14 


4988 


1759


3545


5331 


7117 


784CIP2E_15


5835 


1760


3546


5332 


7118 


784CIP2E_16 


7682 


1761


3547


5333 


7119 


784CIP2E_17 


7682 


1762


3548


5334 


7120 


784CIP2E_18


7699 


1763


3549


5335


7121 


784CIP2E_19


7707


1764


3550


5336


7122


784CIP2E_20


7707 


1765


3551


5337


7123


784CIP2E_21


7752 


1766


3552


5338


7124


784CIP2E_22


O T [T 1 

B357 


1767


3553


5339


7125


784CIP2E_23


9065 


1768


3554


5340


7126


784CIP2E_24


9324 


1769


3555


5341


7127


784CIP2F_1 


2976 


1770


3556


5342


7128


784CIP2F_2


3 rrn 


1771


3557


5343


7129


784CIP2F_3


4021


1772


3558


5344


7130


784CIP2F_4


44 /4 


1773


3559


5345


7131


784CIP2F_5


j< c a £1 
45bb 


1774


3560


5346


7132


784CIP2F_6


4 /U5 




1775


3561


5347


7133


784CIP2F_7


4 / U / 


1776


3562


5348


7134


784CIP2F_8


A "71 "5 
4 /XjS 


1777 


3563 


5349 


7135 


784CIP2F_9


5008 


1778 


3564 


5350 


7136 


784CIP2F_10 


5009 


1779 


3565 


5351 


7137 


784CIP2F_11 


5015 


1780 


3566 


5352 


7138 


784CIP2F_12


5015 


1781 


3567 


5353 


7139 


784CIP2F_13 


7724 


1782 


3568 


5354 


7140 


784CIP2F_14 


7725 


1783 


3569 


5355 


7141 


784CIP2F_15 


8828 


1784 


3570 


5356 


7142 


784CIP2F_16 


8830 


1785 


3571 


5357 


7143 


784CIP2F_17 


9739 


1786 


3572 


5358


7144 


784CIP2F_18 


9896


TABLE 7


SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon,
/=possible nucleotide deletion, \=possible nucleotide insertion)


5359


337


1131 


AHI^ARI£ALILDEVAILPAPQNI>SVI^TN^1KHLIJWSPVIA^><3 
ETVYYS VEYQGEYESLYTSHIW I PS SWCSLTEGP3CDVTDD ITA 
TV P YNLRVRATLGSQTS /CLEHP /VS I PLI ETQPSLPDL/RMBI 
TKDG FHLVTELEDLG PQFEFLVAYWRRBPGAEEHVKMVRSGGI P 
VHLETME PGAAY CVKAQTFVKA I laK Y i> by 1 K^wfcvyvjtAJ.i'l* 
VrjU.FAWGFMLILVWPLFWKMGPJL.IiQ/YLLLPRGGSSQTPW 

KITQF 


5360


2 


1115 


PR VRS SGGQED P ASQQWARPRFTQP S KMRRRV I ARPVGS SVRLK 
C^ASGHPRPDITWMKDDQALTRPEAAEPRKKKvnrLSLKNIJ^PED 
SGKYTCRVSNRAGAINATYKVDVI QRTRSKP VLTGTHPVNTTVD 
FGGTTS FOCKVRSDVKPVIQWLKRVEYGAEGRHNSTIDVGGQKF 
VVLPTGDWSRPIXSSYIiNKLI^TRARQDDAGWICLGANTMGYS 
FRSAFLTVLPDPKPPGPPVASSSSATSLPWPWIGIPAGAVFIL 
GTLLLWLCQAQKKPCTPAP AP PLPGHRPPGTARDRSGD KDLPSL 
AALSAGPGVGLCEEHGSPAAPQHLLGPGPVAGPKLYPKLYTGHS 
TPHT YTHPPPS CQLNSSHS 


5361


3 


925 


' HEGSISSANILLDDQFQPiGbTDFAMAHFRSHLEHQSCTINMTSS 
SSKKLWYMPEE YIRQGKLS I XTDVYS FG I VI MEVLTG CR WLDD 
PKHIQLRDLLRELMEKRGLDSCLSFLDKKVPPCPRNFSAKLFCL 
AGRCAATRAKLRPSMDEVLNTLESTQASLYFAEDPPTSLKS FRC 
PSPLFLENVPS I PVEDDESQNNNLLPSDEGLRIDRMTQKTPFEC 
SQSEVMFLSLDKKPESKRNEEACNMPSSSCEESWFPKYIVPSQD 

LRP YKVNIDP S SEAPGHS CRS RPVES S CSSKFS WDE YEQYKKE 


5362 


2 


4879 

* 


SCQVEGCTRTYNSSQS IGKHMKTAHPDQ YAAFKMQRKS KKGQKA 
NNLNTPNNGKFVY FLPS PVNS SNPFFTS QTKANGNPACS AQLQH 
VSPPIFPAHLASVSTPIiLSSMESVINPNITSQDKN2QGGMLCSQ 
ME1JLPSTALPAQMEDLTKTVLPLNIDRGSDPFLSLPAESSSIDL 
FPSPADSGTNSVFSQLE1WTNHYSSQIEGNTNSSFLKGGNGENA 
VFPSQ VNVANK FSS TNAQQSAPEKVKKDRGRGQTGKERKPKHNK 
RAKKPAI IRDGIG?ICSRCYRAFTNPRSLGGHLSKRS YCKPLDGA 
EIAQELLQSNGQPSLLASMILSTNAVNLQQPQQSTFNPEACFKD 
PSFLQLLAENRSPAFLPNTFPRSGVTNFNTSVSQEGSE1I IQAI» 
ETAGI PSTFEGAEMLSHVSTGCVSDASQV^TVMPNPTVPPLLH 
TVCHPNTLLTNQNRTSNS KTSS I EECSSLPVFPTNDLLLKTVEN 
GhCSSS FPNSGGPSQNFTSNSSRVSVI SGPQNTRSSHLNKKGJTS 
AS KRRKKVAPPL I APNASQNLVTSDLTTMGLIAKSVE I PTTNLH 
SNVIPTCEPQS LVEI^TQKLNNVNNQLFMTDVKENFKTSLESHT 
VLAPLTLKTENGDS QMMALNS CTTSVNSDLQI S BDNV I QNFEKT 
LEI IKTAMNSQ I IiEVKSGSQGAGETSQNAQINYNI QLPSVNTVQ 
NNKLPDSSP\FSSFISVMPTESNIPQSE\VSHKBDQIQEILEGI» 
QKLKLEKDLS T PAS QCVL INTS VTLTP T P VKSTAD I TV I QP VS E 
M INIQFNDKVNKP F VCQNQGCNYSAMTKDALFKHYGKIHQYTPE 
M I LE I KKNQLKFAP F KC WPTCTKTFTRNS NLRAH CQ L.VHH FTT 
EEMVKLK I KRPYGRKSQSENVPAS RSTQVKKQLAMTEENKKESQ 
PALELRAETWTHSNV3VVIPEKQLIEIOCSPDKTF.SSLQVITVTS 
EQCOTNALTNTQTKGRKIRRHKKE KEE KKRKKPVS QS LEFPTR Y 
S PYRP YRCVHQG CFAAFTIQQNLI LHYQAVHKSDLPAFSAEVEE 
t? c-car^ ttp c pvppt trnTT ,KT?FRnf>V e 5DCSRIFOAITGLIQHYMKL 
HEMTPEEIESMTASVDVGKFPCDQLECKSSFTTYI^mnmLF^XJ 

HG IGLRAS KTEEDG VYKCD CEG CDR I YATRS NLLRH I FMKHND K 

HKAHLIRPRRLTPGQENMSSKANQEKSKSKHRGTKHSRCGKEGI 

KMPKTKRKKKNNLE^KNAKIVQIEENKPYSLKRGKHVys 

DALS ECTS RFVTQYP CMI KGCTS WTSESN I IRHYKCHKLS KAF 

TSQHRNL»LIVFKRCCNSQVKETSEQEGAKNDVKDSDTCVSESND 

NSRTTATVSQKEVEKNE* DEMDELTELFITKLINEDSTS VETQA 

OTSSr^SWDFQE^3tNLCQSERQKASNLKRVWKEKNVS QWKKRKVE 

KAE PASAAELSSVRKEEETAVAI QTI EBHPAS FDWSS F KPMGFE 
VSFLKFLEESAVKQIOCtm)KDHPNTGNKKGSHSNSRKNIDKTAV 
TSGNHVCPCKESETFVQFANPSQLQCSDNVKrVLDKNLKBCTEL
VIjKQIiQEMKPTVSLKKLEVHSNDPDMSVMKDIS igkatgrgqy 


5363


8066


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCI PS VPP P VPFPTLWP 
PPSHRRQPPGGIRRDFSRRLPJIEANLVATCiPVRASLPHRIaNML 
RG PG PGLLLLAVLCLGTAVPSTG AS KSKRQAQQMVQPQS PVAVS 
QS KPGCYDNGKHYQ INQQWERTYLGNAL VCTCYGGSRG FNCES K 
PEAEBTCFD KYTGNTYRVGDT YERP KDSMI WDCTC I GAGRGR I S 
CTIANRCHEGGQSYKIGI>TWRRPHETG<3YMIiECVCIiGNGKGEWT 
CKP IAEKCFDHAAGTSYVVGETWEKPYC^VJMr^IXn'CLiGEGSGR 
I TCTS RNRCNDQDTRTS YRIGDTWS KKDNRGNL.LQ C I CTGNGRG 
EVJKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYSVGMQLA* KTQGNKQML \ CTCLGNG VS CQ E TAVTC/1' VG 
GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYECI>Q 
KYSPCTDHTVLVOTRGGNSNGAIiCHFPFIjYNNHNYTDCTSEKRR 
DNMKWCGTTQNYDADQKFGFC PMAAHEE I CTTNEGVMYR IGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRr>QCIVDDITYNVN 
DTFHKRHEEGHMLNCTCFGCXjRGRWKCDPVDQCQDS ETGT fyqi 
GDStTEKYVHGVRYQCYCYGRGIGEWHCQPIiQTYPSSSGPVEVPI 
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 
GHLNSYTIKGLKPGWYEGQLIS IQQYGHQEVTRFDFTTTSTST 
PVTSNT\ VTGETTPFS PLVATSESVTEITAS SFWS WVSASDTV 
SGFRVEYELSEEGDEPQYIiVLPSTATSV\N I P\DLI,PGRKYI VN 
VYQ I S EDGEQS L ILSTSQTTAPDAP PDPTVDQVDDTS I WRWSR 
PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN 
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV 
KVTIMWTPPESAVTGYRVDVIPVNLPGEHGQRLPLSRNTF\ABN 
TGI*S PGVTYY FKVFAVS HGRES KPLTAQQTTKlA DAPTNLQF VN 
ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY 
PLRNLQPAS EYTVSLVAI W3NQES PKATGVFTTLQPGS S I PPYN 
TErVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 
VSGIjTPGVEYVYTI QVLRD3QERDAP \ TVNK\ WTPLS PPTNLH 
LEAWPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEVV 
HAIX2SSCTF\DNL»EVPGLEYNVSVYTVKDDKESVPISDTI ipav 
P PPTDLRFTN / I IjGPDTMRVTW \ AP P PS I DLTNFLVRY S PVKNE 

GRMLQSLS i fflsdnVawltnllpgteywsvssvyeqhestp 
\lrg rqktglds p\tg idfs \di ta\nsft\vhw\ iapra/tp I 

TGYRIR\KHPEHF\SGRPREDR\VPH3RNSIT1>TNI>TPGTEYW 
S IVAltNGREESPLLIGQQSTVSDVPRDLEVVAATPTSLLlNSWD 
APAVTVRYYRITYGETGGNSFVQEFTVPGSKSTATISGLKPGVT) 
YTI TVYAVTGRGDS PAS S KP I SI NYRTE I DKPSQMQVTD VQDNS 
ISVKWLPSSSPVTGYRVTTT\ PKNGPG\ PTKTKTAGPDQTEMTI 
EGLQ PTVEYWS VYAQNPSGESQPIjVOTAVTN'IDRPKGLAFTO V 

dvdsikiawespqgqvsryrvtysspedgihelfpapdgeedta 
ei^ijipgseytvsvvaihddmesqpligtqstaipaptdlkft 
qvtptsl»saqwtppnvqltgyrvrvtpkektgpmkeinlapdss 

SVVVSGLMVATKYEVSVYAIjKDTIiTSRPAQGVVTTLENVS pprr 

arvtdatettitiswrtktetitgfqvdavpangqtpiqrtikp 
dvrsytitglqpgtdykiylytlndnarsspwidastaidaps 

NLRFIoATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREVVPRP 
RPGVTEAT I TGLBPGTE YTI YV 1 AIiKNNQ KS E PL IGRKXTDELP 
QLVTXPHPNIjHGPEII*DVPSTVQKTPFVTHPGYDTGNGIQIjPGT 
SGQQ PS VGQQM I FEEHG FRRTTPPTTATP I RHRPRP YPPNVGQE 
ALSQTTIS WAP FQDTS EYI ISCHPVGTDEEPLQFRVPGTSTSAT 
LTGLTRGAT YNI I VEALKDQQRHKVREEWTVGNS VNEGLNQPT 
DDS C F0PYTVSH YAVGDEWERMSESG FKLLCQCIjGFGSGHFRCD 
S SRWCHDNGVNYK I GE KWDRQG ENGQMMS CTCLGNGKGE FKCD P 
HEATCYDDGKTYHVGEQWQKEYLGAI CSCTCFGGQRGWRCDNCR 
RPGGEPS PEGTTGQSYNQYSQRYHQRTNTNVNCP IECFMPIjDVQ 
ADREDSRE 


5364 



6066 


703 


RI»CCTG<3GEGTPGASGKRGPAATTSLVXiCIPSVPPFVPFPTLWP 
P PSWRRQPPGG IRRDFSRRLRREANIiVATCLPVRASLPHRLNML 
RGPGPGLLI*IAVIiCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS


5365


8066


703


QS KPGCYDNG KHYQ INCX^WERTYl^GNALVCTCYGGSRG FNCES K 
PEAEETCFDKYTGNTY^VGDTYERPKDSMIWDCTCIGAGRGRIS 
CT IANRCHEGG QS Y KI GDTWRRPHETGGYMLECVCIjGNG KGEWT 
CKP IAEKCFDHAAGTS YVVGETWEKP YQGWMMVDCTCLGEGSGR 
ITCT5RNRCNDQDTRTSYR.IGDTWSKXDNRGNIi^ 
EWKCERHTS VQTTS SGSGPFTDVRAAVYQPQPHPQPPP YGHCVT 
DSGWYSVGMQLA* KTQGNKQMI>\CTCIiGNGVSCQETAVTQTYG 
GNSNGEPCVLP PTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 
KYSFCTDHTVXVCyTRGGNSNGALCHFPFLYNNHNYTIXrrSEGRR 

DNMKWCGTTQN YDADQ KFG FCPMAAH2E I CTTNEGVMYR IGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 
DT-FHKRHEEGHMLNCrrCFGO^GRWKCDPVDOCQDSETGTFYOI 

GDSWEKYVHGVRYQC Y CYGRG I GEWHCQPLQTY P S SSGP VEVF I 
TETPSQPNSHPIQWNAPQPSH1SKYIURWRPKNSVGRWKEATIP- 
GHLNS YT I KGLKPGWYEGQL1 S I QQYGHQEVTRFDFTTTS TST 
P VTSNT\ VTGETTP FS PLVATSESVTEITASS FWSWVS ASDTV 
SGFR\rEYELSEEGDEPQYLVLPSTATSV\NlP\DLLPGRKYIVN 
VYQISEIX5EQSLILSTSQTTAPDAPPDPTVDQVDDTSIVVRWSR 
POAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN 
IT I YAVEENQESTP W I QQETTGTPRSDTVPS PRDLQ FVE VTDV 
KVTI MWTP PESAVTG YR VDVI PVNIiPG EHGQRLPI*S RNT F \ AEN 
TGLS P GVT Y Y FKVFAVS HGR ES KP LTAQQTTKL \ DAPTN1.Q FVN 
ETDS TVL VRWTP P RAQ I TGYRLTVGLTRRGQPRQYNVGPSVS KY 
PLRNLQPASE YTVSLVAI KGNQESPKATGVFTTlQPGSS IPPYN 
TEWETTrVITtOTPAPRIGFKliGVRPSQGGBAPREVTSDSGSIV 

VSGLTPGVEYVTTIQVIiRDGQERDAP \ IVNK\WTPI*SPPTNI*H 
LEANPDTGVLTVSWERSTTPDlTGYRITTTPTNGQQGNSIiEEVV 
HADQ S S CTF\ DNIiETVPGLE YNVS VYTVKDDKES VP I SDTII PAV 
PPPTDLRFTN/ILGPDTMRVTWVAPPPSIDLTNFLVRYSPVKNE 
GRMLQSLS I FFLS DM \AWI>TNLIiPGTEYWSVSS VYEQHES TP 
\I^GRQKTGI*DSP\TGJDFS\DITA\NSFT\VHW\IAPRA/TPI 
TGYRIR\HHPEHF\2GRPREDR\VPHSRNSITLTHI»TPGTEYVV 
SIVAI*NGREESPLLIGQQSTVSDVPRDI,EWAATPTSI,I J I\SWD 
APAVTVR YYR I TYGETGGNS PVQEFTVPGSKS TATI SGLKPGVD 
YTI TVYAVTGRGDSPASSKP ISINYRTEIDKPSQMQVTDVQDNS 
ISVKWLPSSSPVTGYRVTTT\PKNGPG\PTKrKTAGPDQTEMTI 
EGLQ P TVEYWS VYAQN P S G ESQ PLVQTAVTN IDRP KGIiAFTD V 
DVDS I KI AWES PQGQVSRYRVTYSSP EDGI HELFPAPDGEEDTA 
ELQGLRPGSEYTVSWALHDDMESQPLIGTQSTAIPAPTDLKFr 

QVTPTSLSAQWTPPNVQLTG YRVRVTPKEKTGPMKE INLAPDSS 
SVWSGLMVATTCYEVSVYAI*KDT1»TSRPA 

ARVTDATETT ITIS WRTKTETI TGFQVDAVP ANGQTP IQRTI KP 
DVRSYTITGI^PGTX>YKIYIjYTLNDNARSSPVVIDASTAIDAPS 

NLRFLATTPNS LLVSWQ P PRARITGYI IKYEKPGSPPREWPRP 
RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRiOCTDELP 
QLVTLPHPNLHGPE IIJ5VPSTVQKTPFVTHPGYDTGNGIQLPGT 
SGQ^PSVGMMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 
ALSQTTISWAPFQDTSEYIISCHPVGTDEEPIiQFRVPGTSTSAT 
LTGLTRGATYN I IVEALKDQQRHKVREEVVTVGNSVNEGLNQPT 
DDSCFDPYTVSHYAVGDEWERWSESGFKliliCQCLGFGSGHFRCD 
S S RW CHDNGVNY KI GE KWDRQGENGQMMS CTCLGNG KG E FKCD P 
HEATC YDDGKT YHVGEQWQKEYI/3AI CSCTCFGGQRGWRCDNCR 
RPGGEPS PEGTTGQS YNQYSQRYHQRTNTNVNCPI ECFMPLDVQ 

ADREDSRE 

RLCXnWGEGTPGASGKRGPAATTSLiVIiCIPSVPPPVPFPTLV/P 



PPSWRRQPPGG IRRDFSRRI.RREANLVATCLPVRAS LPHRIJ^ML 
RG PG PG LLLIAVL CIX3TAVP STGAS KS KRQAQQMVQ PQS P VAVS 
QSKPGCYDNGKHYQ INQQWBRTYIiGNAI*VCrCYGGSRGFNCES K 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 

CTIANRCHEGGQS YKIGDTWRRPHETGG YMLECVCLGNG KGEWT 
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR

5366






ITCreRNRCNDQDTRTS YRIGDTWSKKDNRGNJLIiQCI CTGNGRG 
| EWKCERHTSVQTTSSGSGPFTOVRAAVyQPQPHPOPPPYGHCVT 
DSGWYS VGMQLA* KTQGNKQMIi\CTCLGHGVS CCETAVTQTYG 
GNSKGEPCVliPPTYNGRTFYSCTTEGRQIXniljWCSTTSNYEQDQ 
KYS FCTDHTVL VQTRGGTJSNGALCHFPFIj YKMHNYTDCT EGRR 
DNMKWCGTTQNYDADQKFGFCPMAAHEE I CTTNEGVMYR IGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCI AYSQI»RDQ CIVDD I TYNVN 
in'FHKRHEEGHMl^CTCFGQGRGRWKCDPVDQCQDSETGTFYQI 
GUSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVBVFI 
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 
GHLNS YTI KGLXPGWYEGQLI S IQQYGHQEVTRFDFTTTSTST 
j P VTSNT\ VTGETTPFS PLVATS ES VTEITASS F WSWVSASDTV 
SGFR VEYEI*SEEGDEPQrLVLPSTATS V\NIP \DLLPGRKYI VN 
VYQ ISEDGEQSLI LSTSQTTAPDAPPDPTVDQVDDTS I WRWSR 
PQAPITGYRIVYSPSVEGS5TELNLPETANSVTL3DLQPGVQYN 
I TI YA VEENQ E£ TP WT QQETTG TPRS DTVPS PRDLQ FVE VTDV 
KVTIMWTPPESAVTGYRVDV1 PVNI*PGEHGQRI*PLSRNTF\AEN 
TGLSPGVTyYFKVFAVSHGRESKPLTAQQTTKL\I>APTNLQFVN 
ETDSTVLVRWTPPRAQITGYRIiTVGLTRRGQPRQYNVGPSVSKY 
PLRNLQPASE YTVSIiVAl KGXQES PKATGVFTTLQPGSS I PP YN 
TEVTETTIVI TWTPAPR IGFKI/3VRPSQGGEAPREVTSDSGS I V 
VSGLTPGVEYVYTIQVLRDGQERDAP\IVNK\WTPLSPPTNLH 
I£ANPDTGVI*TVSWERSTTPDITGYRITTTPTNG0^3GNSLEEVV 
HA0QS SCTF\ DNLRVPGLEYMVS VYTn/KDDKES VPISDT I IPAV 
PPPTDIJiFTN/II^PDTMRVTW\APPPSXDI,TNFI>VRYSPVKNE 
j GI^bQSI^IFFl>SDN\^VVLT^LLPGTEYVVSVSSVYEQHESTP 

1 \lrgrq ktglds p \ tgidfs \dita\wsft\vhw\ iapra/tpi 
tgyrir\hhpehf\sgrpredr\vphsrnsitltni»tpgteyvv 
sivalngreespiiijigqqstvsdvprdlewaatptsliii\swd 
apavtvryyr i t ygetggns p vqeftvpgskstati sglkpgvd 
ytitvyavtgrgds pas s kp i s inyrte i dkpsqmqvtdvqdns 
isvkwlpssspvtgyrviriv\pkngpg\ptktktagpdqtemti 
eglqptveyvvs vyaqnps ge sqplvqtavtni drp kgliaftdv 
dvdsikiai-^spqgqvsryrvtysspedgihelfpapdgeedta 
ei^ijlpgseytvsvvaijffidmesopligtqstazpaptdi/kft 
qvtpts i*saqwtppnvqltg yr vrvtp kektgpmke inlapds s 
svwsgiimvatkywsvyalkdtltsrpaqgvvttlenvspprr 
arvtdatetti ti s wrtktet i tgfqvdavp angqtp iqrt i kp 
dvrsyt itglqpgtdyki ylytlndnarssp wi das tai daps 
nlrflattpnsltvswqppraritgyi i kyekpgspprewprp 
rpgvteatitglepgteytiyvialknnqksepligrkktdelp 
qlvtlphpnlhgpe ildvpstvqktpfvthpgydtgngiqlpgt 
sgqq ps vgqqm i feehgfrrttp pttatp i rhrpr p yp pnvgqe 
alsqtti3 wapfqdtse yi i s chpvgtdee plqfrvpgtsts at 

LTGLTRGAT YN 1 1 VEAI*KDQQRHKVREEVVTVGNS VNEGLNQPT 
DDSCFDPYTVSKYAVGDET^RMSESGFKLIjCOCLGFGSGHFRCD 
SSRWCHDNGVTQYKIGEKWDRC<3ENGQMMSCTCLGNGKGEFKCDP 

heatcyddgktyhvgeqwqkeylgaics CTCFGGQRGWRCDNCR 

RPGGEPS PEGTTGQS YNQYSQRYHQRTNTNWCP I ECFMPUDVQ 
ADREDSRE 




8066 


703 


RliCCTGGGEGTPGASG KRGPAATTS hVLC I PS VP P PVP F PTLWP 
PPSWRRQPPGGlRRDFSRraLRREANLVATCliPVRASLPHRLNML 
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQS PVAVS 
QSKPGCYDNGKHYQINQQWERTYI/5NALVCTCYGGSRGFNCESK 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRI S 
CTIANRCHEGGQSYKIGDTVniRPHETGGYMLECVCLGNGKGEWT 
CKPIAEKCFDHAAGTSYVVGETWEKPYQGWMMVDCTC^EGSGR 
ITCTSRNRCNDQDTRTSYRIGDTWSKKDNRGNLIjQCI CTGNGRG 
EWKCERHTS VQTTS SGSG PFTDVRAAVYQPQ PHPQP P P YGHCVT 
DSGWYSVGMQLA* KTQGNKQML \ CTCL.GNGVS CQETAVTQTYG 
GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGH1.WCSTTSNYEQDQ







KYS FCTDHTVLVQTRGGNSNGAIjCHFPFLYNNHNYTDCTSEGRR 
I3NMKWCGTTQNYD ADQ K FG FCPMAAHE E I CTTNEGVM Y R I GDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRXKX^IVDDITYNVN 
DTFHKRHEEGHMIjNCT C3?GQGRGRWKCDPVDQCQDSETGTFYQ I 
GDSWEKYVHGVRYQCY CyGRGIGEWHCQPLQTYPSSSGPVEVFI 
TBTPSQPNSHP IQWNAPQPSH ISKYI LRWRPKNSVGRWKEATI P 
GHLNSYTI KGLKPGWYEGQLIS I QQYGHQEVTRFDFTTTSTST 
PVTSOT\VTGETTPFSPLVATSESVTEITASSFVVSWVSASDTV 
SG FR VE YEI»S EEGDE PQ YLVI*PSTATSV\N I P \DLLPGRKYIVN 
VYQ I SEDGEQSLI LSTSQTTAPDAPPDPTVDQVDDTS IWRWSR 
PQAPITGYR I VYS PSVEGS STEIiNLPSTANS VTLSDLQ PG VQYN 
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV 
K\TriMWTPPESAVTGYllVDVIPVNl*PGEHGORLPLSRNTF\AEN 
TGLSPGVTYYFKVFAVSHGRESKPLTAQQTTKL.\DAPTNLQFVN 
ETDSTVLVRVfTP PRAQ 1 TG YRIiTVGLTRRGQ PRQ YNVGPS VSKY 
PLRNLQPASE YTVSLVAI KGNQES PKATGVFTTLQPGSS I PPYN 
TEVTETTIVI TWTPAPRIGFKLGVRPSQGGEAPREVTSDSGS I V 

VSGLTPG VE YVYTIQVIiRDGQERDAP \ IVNK\ WTPLSPPTNLH 
LEANPDTGVLTVSWEPJSTTPDITGYRITTTPTNG<XK^NSI J EEVV 
I IADQS S CTF \ DNLEVPGLE YNVS VYTVKDDKES VP IS DTI I PAV 
PPPTDLRFTN/ILGPDTMRVTlJ\APPPSIDLTNFuVRYSPVKNE 
GRMLQSI^IFFIiSDN\AVVI,TNI»LPGTEYVVSVSSVYEQHESTP 
\LRGRQKTGI.DSP \TGIDFS \DITA\NSFT\VHW \IAPRA/TPI 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITI»TNLTPGTEYW 
SIVALMGREESPL1.IGQQSTVSDVPRDLEVVAATPTSLLI \SWD 
APAVTVRYYRI TYGETGGNSPVQEFTVPGSKSTATISGtiKPGVD 
YTITVYAVTGRGDS PASS KP I S INYRTE IDKPSQMQVTDVQDNS 
I SVKWLPSSS P VTG YRVTTT\ PKNGPG\ PTKTKTAGPDQTEMTI 
EGIiQPTVE YWS VYAQNPSGESQ PLVQTAVTN I DRPKGLAFTD V 
DVDS I KIAWES PQGQVSRYRVTYSS PEDGIHBLFPAPDGEEDTA 
EIjQGLRPGSE YTVS WALHDDMESQPIj I GTQSTAI PAPTDLKFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPXEKTGPMKEINLAPDSS 
SVWSGIJ^VATKYEVSVYAliKDTI/TSRPAOGVVTTIiEWSPPRR 
ARVTDATETT I T I S WRTKTETITGFQVDAVPANGQTP IQRT I KP 
DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS 

NLRFLATTPNSLLVSWQPPRARITGYI IKYEKPGS PPREWPRP 
RPGVTEATITGLEPGTETYT I YVI ALKNNQKSE PL IGRKKTDEIi P 
QLVT1»PHPNIjHGPEII^VPSTVQKTPFVTHPGYDTGNGIQLPGT 
SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 
ALSQTTIS WAPFQDTSEY I IS CHPVGTDEEPLQFRVPGTSTSAT 
LTGLTRGATYN I IVEAI,KDQQRHKVREEWTVGNS VNEGLNQPT 
DDSCFT)PYTVSKYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 
SSRMCHDNGVirYKIGEKWDRG^ENGQMMSCTGLGNGKGEFKCDP 
HEATCYDDGKTYHVGEQWOKEYLGAICSCTCFGGQRGWRCIINCR 
RPGGEPS PEGTTGQS YNQYSQRYHQRTKTNVNCPIECFMPLDVQ 
ADREDSRE 


5367 


235 


3591 


KKI LNMLCKKN I V I EYXAD I LYEYX.YGFCFSGI KKYL 1 1 HVLRL 

ILELWMTRLLI£KSVSI,C/rQYIiIiI.IVKII*SWFPGKEMR^ 

EVMMRKQDS /RXVGNGSEQQLQKELADVLMDPPMDDQPGEKELV 

KRSQLDGEGDGPLSIJQLSASSTINPVPLVGLOKPEMSLPVKPGO 

GDSEASSPFTPVADEDSVVFSKIiTYtX3C3iSVNAPRSEVEALRNM 

SII^QCQISIJ3VTI^VPOTSEGIVRLLDPQTNTEIANYPIYKI 

LFCWGHDGTPESDCFAFTESHYNAELFRIHVFRCEIQEAVSRI 

LYSFATAFRRSAKQTPLSATAAPQTPDSDIFTFSVSLEIKEDDG 

KGYFSAVPKDKDRQCFKLRQGIDKKI VI YVQQTTNKEIiAI ERCF 
GIJjI*SPGKDVRNSDMHI»IjDIiESMGKSS15GKSYVTTGSWNPKSPH 
FQWNEET P KDKVLFMTTAVDLV I TEVQEPVR FLLET KVR VCS P 
NERLFWPFSKRSTTKN FFLKLKQI KQRERKNNTDTJbYEWCLBS 
ESERERRKTTASPSVRIiPQSGSQSSVTPSPPEDDEBEDNDEPLI* 
SGSGDVSKECAEKILETWGEI*I»SKWHl4NLNVRPKQLSSLVRNGV 
PEAI^EVWQIJ^CHOTroHIiVEKYRILITKESPQDSAITRDIN 
RTFPAHDYFKmWDGQDSttYKICKAySVyDEEIGYCQGQSFIA 
AyiALHMPEEQAFSVLVXIMPDYGLRELFKQNFEDLH CK FYQLE 
RLMQEYI PDLYNHFLD I S LEAHM YASQ WFLTLFTAKFP I> YMVFH 
IIDLLLCEGISVXFNVTUiGLLKTS KDDLU iTDFEGALKFFRVQL 
PKRYRSEENAKKIWEIiACNMKISQKKliKIcyEKEYHTM^^ 
EDP I ERFERENRRLQEAN>n^EQENDDI*AHELVTS KIA3LR KDLD 
NAEEKADALNK3LLMTKQKLI DAEEEKRRLEEESAHLKKMCRRE 
LDKAESEIKKMSSIIGDYKQICSQIiSERLEKQQTANKVEIEKIR 
QKVDDCERCREFFNKEGRVKGISSTKEVLDEDTDEEKETLK^ 
REMELFXAQTKL\QLVEASCXIQD\LEHPF* GLPFNE\VQAA\ k 
KTWFNRTLSS I KTATGVQGKETC 


5368


573 


2014 


RQAAG I PG I TPTEEKDGNLPD 1 VNSG SLHE FLVNLHERYG PWS 
FWFX3RRLWSIX3WDVI,KOHINPNKTr^/I,F*NHAEVIIKv-SIW 
WWQCE * KP\QRKKLYENGVTDSIiKSNFALliLKLPEELIiDKWIiSY 
PElXJH\VPLSQHMLGFAMKSvTQMV>lGSTFEDDQEVIRFQKNHG 
TVWSEIGKGFIiDGSLDKNMTRKKQ YEDALMQLES VLRN 1 1 KERK 
GRi^SQHIFIDSLVQGNLNDQQILEDSMI FSLASCI ITAKLCTW 
AI WFLTTSEEVQKKL YEE I NQVFGNGP VTPEKIEQLR YCQHVLC 
ETVRTAKLTPVSAQLQDI EGKIDRF I IPRETLVL YALGWXQDP 
1TTWPS PHKFI>PDRFiODELVMKTFS S LGFSGTQE CPELR F AYMVT 
TVLLS VL VTCRLHLLSVEGQVT ETKYELiVTS SREE AW I TVS KR Y 


5369 


1 


6622 


PRSLCFSLWAEAAV1iAIXMIjRRRRR1*I*RGTMSAS FVPNGASLED 
CHCNLFCLADLTG I KWKKYVWQGPTS AP I LFPVTEEDP I LSSFS 
RCLKADVLG/VWRRDQRPERREM** I FWGGEDP\VXiLiT1>FTMTY 
QKKKMECGRMDFPNlNAVLCFSKAvTii^ 

VKPYEKDEKPINKSEHIISCSFTFFLHGDSNVCTSVEINQHQPVY
I^EEHITI^QQSNSPFQVIU:PFGLNGTI>TGQAFKMSDSATKK
LIGEWKQFYPISCCLKEMSEEKQED^WEDDSLAAVEVLVAGVR
MIYPACFVLVPQSDIPTPSPVGSTHCSSSCLGVHQVPASTRDPA

MSSVTLTPPTSPEEVQTVDPQSVQKWVKFSSVSDGFNSDSTSHH 
GGKI PRKIJA^^HVVDRVWQECNMNRAONKRKYSASSGGI*CE EATA 
AKVASWDFVEATQRTNCS CLRHKNLKS RNAGQQGQAPSLGQQQQ 
ILPKHKTNEKQEKSEKPQKRPLTPFHHRVSVSDDVGMD\ADS\A 
SQRI»V\ I SAP\ DSQ\ VRFS1^IR\TNDVAK\TPQMHGTEMANS PQ 
PPPI^P\HPCI>VVDEGVTKTPSTPQSQHFYQMPT?DPLVPSKPM 

EDRIDSLSQS FPPQYQEAVEPTVYVGTAVNLEEDEANIAWKYYK 
FPKKKDV^FliPPQLPSDKFKDDPVGPFGQESVTSVTELMVQCKK 
PLKVSDELVQQ YQ I KNQCLSAIAS DAEQ EPKIDP YAFVEGDEEF 
LFPDKKDRQNS E REAGKKHKVEDGTS S VTVLSKEEDAMS L»FS P S 
IKQDAPRPTSKARP PSTS L I YDSDLAVS YTDLDNL FNSDEDELT 
PGSKRSANGSDDKASCKBSKTGNLDPLSCISTADIjHKMYPTPPS 
LEQKIMGFSPMNMNNKEYGSMDTTPGGTVLEGNSSSIG 

VDEGFCSPKPSBIKDFSYVYKPENCQILVGCSMFAPLKTLPSQY
LPLIKLPEECIYRQSWTVGKLELLSSGPSMPFIKEGDGSNMDQE 
YGTAYTPCOTTSCX3MPPSSAPPSNSGAGI1*PSPSTPRFPTPRTP 
RTPRTPRGAGGPASAQGSVKYENSDLYS PASTPSTCRPLNS VEP 
ATVPSIPEAHSLYVNLII^ESVMISQ^FKDCNSDSCCICVCWMNIK 
GADVG V Y I PDPTQEAQ YRCTCG FSAVMNRKFGNNSGLFFEDELD 
I IGRNTDCGKEAEKRFEALRATSAEHVNGGLKES EKLSDDL ILL 
LQI>3CTNLFSrFGAADQDPFPKBGVISN>A^\^ERJX:CNDCYLA 
LEHGRQFMDNMSGGKVDEALVKS SCLHPWSKRNDVSMQCSQD IL 
RMLLSLQPVLQDAIQKKRTVRPWGVQGPLTWQQFHKMAGRGSYG 
TOESPEPLPIPTFLLGY1DYDYLVLSPFALPYWERLMI1EPYGSQR 
D IAYWLCPENEALLNGAKS V FRDLTAI YESCRLGQHRPVSRLL 
TDGIMRVGSTAS KKLSEKLVAEWFSQAAI)GN 
CKYDIXSPYIJ^LPLDSSLLSQPNLVAPTSQSLITPPQMTNTGNA 
OTPSATIASAASSTMTVTSGVAISTSVATANSTLTTASTSSSSS 
SNU^SGVSSNKLPSFPPFGSMNSNAAGSMSTQANTVQSGQLGGQ 
QTSALQTAGISGESS S LPTQPHPD VSES TMDRDKVG I PTDGDSH 
AVTYP PAI WYI IDP FTYENTDES TNS S S VWTLGLLR CFLEMVQ 
TLPPHIKSTVSVQIIPCQYLLQPVKHE2DREIYPQHLKSLAPSAP 

TOCRRPLPTSTNVKTLTGFGPGLAMETALRS PDRPECIRLYAPP 

FILAPVKDKQTELGETFGEAGQKYNVLFVGYCI>SHIX3RWII^C 

TDLYGELLBTCI IN I D VPNRARRKKS 3 ARJCFG LQKL W EW CLGL V 

QMSSLPWRWIGRI/5RIGHGELKDWSCLLSRRNLQSLSKRLKDM 

CRMCGISAADSPSILSACLVAMEPCGSFVIMPDSVSTGSVFGRS 

TTLNMQTSQLNTPQDTS CTH1LVFPTS AS VQVASATYTTENLDL

AFNPNNDGADGKGIFDLLDTGDDLDPDI INI LPASPTGSPVHSP 

R<iRVPHnGT5AGKGOSTDRLLSTEPHEEVPNILQQPLALGYFVST 

AKAGPLPDWFWSACPQAQYQCPLFLKASLHLHVPSVQSDELLHS 

KHSHPLDSNQTSDVLRFVLEQYNALSWLTCDPATQDRRSCLPIH 

FWLNQLYNFI MNML 


5370 


1226


716 


RWSRKl^LRRAAQATBSRPPQSQEMHPPTGKEVHALKRLRDSAN 
ANDVETVQQLLEDGADPCAADDKGRTALHFAS CNGNDQ I VQLLL 
DHGADPNQR1>GIX3NTPLHLAACTNHVPV I TTLLRGGAS VDALDR 
AGRTPLHLAKSKLNILQEGHAQCLKAVR /HGGEADHP YAEGVSG 
APRAT*AARCSGVFPSPSRWLGSAPWSRSSCTIWSLPLHEAKCR 
-R\ro dt ccnjvrir*civi>QCQQfTf*TV<;TSliALAESLSLFRACTSLPVG 

GCISWL 


5371 


1331 


167 


IAAMLWKLLLRSQSCRLCS FRKMRS P PKYRP FLACFTYTTDKQS 
SKENTRTVEKLYKCSVDIRKIRR\*KX«YF*RMKPMLKKLRI/P 
LQELGADETAVAS I LERCP EATVCS PTAVNTQRKLWQLVCKNEB 
EL I KLI EQFP ES FFT1 KDQSNQKLNVQFFQELGLKNVV I S RLLT 
AAPNVFHNP VE KNKQMVRI LQES YLDVGGS EANMKVWLLKLLSQ 
Kt-nrrTT t MCTypR titcpt v vi ■rYBWTET Q W. 7 T/}LLSKLKGFLFOLC 
PRS IQNS ISFS KNAFKCTDHDLKQLVLKCPALLYYSVPVLEERM 
QGLLREGI S 1AQIRETPMVLELTPQI VQYR I RKLNSSG YRI KDG 
HLANLNGSKKEFEANFGKIQAKKVTIPLF^PVAPLNVEE 


5372 


51 


857


SPGAQFLWAAPDMPDPLFSAVQGKDEILHKALCFCPWLGKGGME
Pt>RLLILLFVTELSGAHNTTVFG^V7AGQSIXJVSCPYDSMKHWGR 
RKAWCRQLGEKGP CQRWS THNLWLLS FLRRWNGSTAITDDTLG 
GTLT I TLRNLQPHDAGLYQ CQSLHGS E ADTLRKVLVEVLAD PLD 
HRDAGDLWFPG\DLRASRM?MWSTAS?GASWKEK3PSHPLPSFS 
SW PAS FSSRF* Q PAPSGLQ PGMDRS QGH I HPVNWTVAMTQG I SS 
KLCQG 


5373 


2814 


346 


" VKKT KS I FNS AMQ EMEVY VEN I RR KFGVFNYS P FRTP YTPNSQY 
QMLLD PTN PS AGT AXI DKQE KV KLNFDMTAS P KI LMS KP VLSGG 
TGERI SLSDMPRS PMSTNSSVHTGSDVEQDAEKKATSSHFSASE 
ESMDFLDKSTAS P AS TKTGQ AGS LSGSPKPFS PQLS AP I TTKTD 
KTSTTGSILNLNLDRSKAEMDLKELS ESVQQQSTPVPLIS PKRQ 
IRSRFQLNLDKTIESCKAQLGINEISEDVYTAVEHSDSEDSEKS 
DSSDSEYISDDEQKS*GTSQEDTEDKEGCQMDKEPSAVKKKPKP 
TNPVEIKEELKSTSPASEKADPGAVKDKASPEPEKDFSGKAKPS 

PKPTKDKLKGKDETDS PTVHLGLDSDSEXNELVT DLGEDHSGRE 

GRKNKKEPKEPSPKQDWGKTPPSTTVGSHSPPETPVLTRSSAQ 

TSAAGATATTSTSSTVTVTAPAPAATGSPVKKQRPLLPKE \ TAP 

AVQRSCGTSSTVQQKEITQSPSTSTITLVTSTQSSPLVTSSGSM 

STLVSSVNGDLPIGTASADVAADIAKYTSKL\MDAIKGTM\TEI 

Yl^I^KN\TTWKAQLAEDSOGIJlIEIEKLQWLHOX2BL\SEM 

LELTMAEMRQS WEQERDRL I AEVKKQLELEKQQAVDETKKKQWC 

ANFKKEAI FYCCWNTS YCD YPCQ\ QAHWPEH\MKS CTQSATAPQ 

\QEABAE\VNTETLNKSSQGSSSSTQSAPSETASA\SKEKETSA 

EKSKESGSTLDLSGSRETPSSILLGSNQGSDHSR\SNKSSWSSS 

DEKRGSXTRSDHN/TPSTQHGRSLLPGKESRAGTPFLGTSK 


5374 


2814 


346 


VKKTKS IFNSAMQEMBVYVENIRRKFGVFN YSPFRTPYTPNSQY 
QMT J.DPTNPS AGTAKIDKQEKVKLNFDMTASPKI IjMSKPVLSGG 
TGRR I S LSDMP RS PMSTNS S VHTGS D VEQDAE KKATS SHFS AS E 
ESMDFLDKSTAS PAS TKTGQAGSLSGSPKPFSPQLSAP I TTKTD 
KTSTTGSIIJ^LNLDRSKAEMDLKELSESVQQQSTPVPLISPKRQ 
IRSRFQIiNLDKTIESCKAQLGINEISEDVYTAVEKSDSEDSEKS 
DS S DS E YI S DDEQKS * GTS QEDTEDKEG<X3MpKEPS AVKKKPKP 
TNP VEI KEELKSTS PAS £ KADPGAVKDKASPEPE KD FSG KAKPS 
PHP I KDKLKGKDSTDSPTVH3^GLDSDSE\NELVI DLGEDHSGRE 
GRKNKKEPKEPS PKQDWGKTPPSTTVGSHS PPETPVLTRSSAQ 
TS AAG ATATTS TS STVTVTAPAP AATGS PVKKQRPLLPKE \TAP 
AVQRS CGTSSTVCXJKEITQSPSTSTITLVTSTQSSPliVTS SGSM 
STIjVSSVNGDljPIGTASA0VAADIAKYTSKlj\MDAIKGTf4\TEI 
YNDLSKN\TTWKAQLAEDSCX3IjRIEIEKL0WIfHQQEI*\SEMKHN 
LELTMAEMRQSI^QERDRLIAEVKKQLEMKQQAVDETKKKQWC 
ANFKKEAIPYCCWNTSYCDYPCQ\QAHWPBH\MKSCTQSATAPQ 
\QEADAE\VNTETLNKSSQGSSSSTQSAPSETASA\SKEKETSA 
EKSKESGSTLDLSGSRETPSSII,IjGSNQGSDHSR\SNKSSWSSS 
DEKRGS\TRSDHN/TPSTQHGRSLt.PGKESRAGTPFLGTSK 


5375 


2907 


1116 


HIFliAEEEPMLERRCRGPLAMGPAQPRLI*SGPSQESPQTLGKES 
RGI>RO^GTSVA\QSGAQAPGRAHRCAHCRRHFPGWVA\IjWXiHTR 
RCQA/RGLPLPCPECGRRFRHAPFIALHRQVHAAATPDWGFACH 
IXXJQSFR^GWVALVLHLRAHSAAKAGPPACPKMARDAFWRRKAAS 
SS IIiRRCHPSRPRGPRPFI CGNCGRS I IiPTWDQ/IiKVAHKRVHV 
SRRP*ERGPPAXVFWGPRPRGPPTGDTPPGPGGDAVDRPF\QCA 
CCGKRFRHK\Pl^IRSHAACTSGERPHQ/CSRECG\KRFTNKPY 
LTS \HRRITHTARQP Y PC KE CGRR FRHKPNLLS H S KIHKRS EGS 
AQAAPGPGSPQLPAGPQESAAEPTPAVPLKPAQEPPPGAPPEHP 
QDP IEAP PS L YS CDDOGRS 7RLERFLRAHQRQHTGER PFTCAEC 
GKNFGKKTHLVAHSRVHSGERP FRLARKCGRRFLPRASQSGGRN 
SAEPNAPRFGP FVCPDCGKAFRHKP Y1AAHRPI ATPAEKPYVCP 
DCRKAFSQKSNL\VSHRRXHTGERPYACPDCDRSFSQKSNLITH 
RKSHIRDGAFCCAICGQTFDDEERIiIiAHQKKHDV 


5376 


4504 


591 


VSTFSLCLWPAGGGGRGRVSKMAQSKRHVYSRTPSGSRMSAEAS 
ARPLRVGSRVEV 1GKGHRGT VAY VGATLFATGKWVGVI LDEAKG 
KNDGTVQGRKYFTCDEGHGI FVRQSQ IQVFEDGADTTS PETPDS 
SAS KVLKREGTDTTAKTS KLRGLKP KKAPTARKTTTRRPKPTRP 
ASTGVAGASS SLGPSGSAS AGEI*S S SE PSTPAQTPLAAP 1 1 PTP 
VXiT S PGAVP P LPS PSKEEEGLRAQVRDLEEKLETLRLKRAEDKA 
KLKELEKHKIQLEQVQEWKSKWQEQQADIjQRRLKEARKEAKEAD 
EAKERYMEEMADTADAIEr4ATliDKEMAEERAESLQQEVEAliKER 
VDELTTDLEILKAEIEEKGSDGAASSYQLKQLEEQNARLKDAIiV 
RMRDI^SSEKQEHVK\LQKXMEKKNQELEVVRQQRERI^QEELSQ 
AESTlDEIJCEQVDAAIiGAEXMVEMLTDRNt^JLEEKVREIJiETVG 
DLEAMNEMNDBLQENARETEL»EI*REQLDMAGARVREAQKRVEAA 
QETVADYQQT I KKYRQLTAHLQDVNRELTNQQEAS VERQQQPPP 
ETFDFKI KFAETKAHAKAIEMELRQMEVAQANRHMSniiTAFMPD 
S FLRPGG DHDCVLVLLLMPRLI CKAEL I RKQAQEKFELS ENCS E 
RPGLRGAAGEQLSFAAIGLVY\SIiMPAAGHRYHRY* CHALSQCR 
IiD\VYKKVGSLYPEMSAHERSIjDFXIEI*I^nCDQLDETVNVEPI*T 
KAIKYYQHLYS IBiAEQPEI)CTMQIiADHIKFTQSAI»DCMSVBVG 
RJURAFLQGGQEATDIALLLRDLETSCS \ DIRQFCKK1RRRMPGT 
DAPGI PAALAPGPQVSDTLl^CRKHIiTWWAVLQEVAAAAAQlil 
APLAENEGLLVAALEEIiAFKAS EQJYGTPSSSP YECIiRQSCNIIj 
ISTMNK\LVTAMQEGEYDAERPPSKP PP \ VELRAAAIiRAE ITDA 
EGLG LKLE DRETV 1 KELKKSIjKI KGEELSEANVRLTLIjEKKLiDS 
AAKDADER I E KVQTRXfiETQALtLRKKE KEFEETMDALQAD I DQ L 
EAEKAELKQRLNSQSKRTIEGLRGPPPSGIATLVSGIAGEEQQR 
GAIPGQAPGSVPGPGI.VKDSPLLLQQ I S AMRLHISQLQHENS I L 
KGAQMKASLASLPPIjHVAiCI*SHEGPGSELPAGAI»YRKTSQI#ljET 
LNQLS THTHWDI TRTS PAAKS PSAQLMEQVAQLKS LSDTVE KL 
KDEVLKETVSQRPGATVPTDFATFPSSAFliRAKEEQQDDTVYMG 
KVTFSCAAGFGQRHRLVLTQEQliHQIjHSRIilS 


5377 


762 



1106 


" DV^CKRVI,PAEAQEKGQLTLSCGESGEEG\F*YHEVRQAEGES* 
/wrcPNVRiVHTQLPrTKKPSGTLKAKFYLHTGSTKFAARISCTK 

SS* WPG YDGWWGGQYI FIFRGMRWEEQP 


5378


2009 


664 


QASGTTl*RPLPDLPQI*KJ?REATSRNRAJjKPRGRl*VlkMTSCIjPAl* 
RFIATPRI*SAMPH1DNDVKIJ>FKDVIjI*RPKRSTIiKSRSEVDLTR 
" S FS FRNS KQTYSG VP I IAANMDT VGTFEMAKV LCKS * V PGS FWD 
VPQMGCVF1»IYKLFTI*KWKMLLI^VI*LPAS ILVAEKFS L FTAVH 
KHYS LVQWQEFAGQN PDCLEHLAASSGTGS SDFEQLEQI LEAI P 
QVKVICIlDVANGYSEHFVEFVKDVRKRFPQHTIMAGNVVTGE^^ 
EELILSGADI I KVG I G PGSVCTTR KKTG VG Y PQLSAVME CADAA 
KGliKGHI I SDGGCS CPGDVAKAFGAGAJ>FVMLGGMLAGHSESGG 
ELI ERDGKKYKLFYGMSS * I \ AM \ KKYAGG VAE YRAS EG KTVEV 
PFKGDVEHTIRDII*GGIRSTCTYVGAAKLKELSRRTTFIRVTQQ 

VNPIFSEAC 


5379 


2009 


664 


QASGTTLRPLPDLPQLKRRKATSimRALKPRGRLVLMTSCUPAL, 
RFIATPRLS AMPHI DNDVKLDFKDVLLR PKRS TLKS RS EVDLTR 
S FS FRNS KQTYSGTVP I IAANMDTVGTFEMAKVLCKS * V PGS FWD 
VPQMGCVPLIYFOliFTLKWKMLLLSVLLPASI^ 
KKYSIjVQWQEFAGQNPDCLEHLAASSGTGSSDFEQI*EQILEAI P 
Q VK Yl CLDVANG YS EH FVBFVKD VRKRFPQHTI MAGNWTGEMV 
EEL I LSGAD1 1 KVGIGPGSVCTTRKKTGVGYPQLSAVMECADAA 
HGLKGHI I S DGGCS CPGDVAKAFG AGADFVMLGGMLAGHS ESGG 
ELIERDGKKYKLFYGMSS * I \AM\KKYAGGVAE YRASEGKTVEV 
PFKGDVEHT I RD JUiGGI RSTCT YVGAAKLKELS RRTTFI RVTQQ 
VNPIFSEAC 


5380 


2 



2050 


PSRAGGAERGRAAAARS PGGS AAGW ECPS VLDEAGACTMSSCVS 
SQPSSNRAAPQDEXGGRGSSSSESQKPCEALRGLSSLS IHLGME 
SFIVVTECEPGCAVDLGIiARDRPLEADGOEVPLDTSGSQARPHL 

SGRKLSLQERSQGGI*AAGGSLDMNGRCI CPSLPYSPVSSPQSS P 
RLPRRPTVESHHVS ITGMQDCVQLNQYTLKDEIGKGS YGWKLA 
YNEITONTYYAMKVLS KKXLIRQAAFPRRPPPRGTRPAPGGCIQP 
RGP I \EQVYQE I A\ I LKKLDH PNW \ KL VEVL\ DDPNE DHL YMV 
F\ ELVNQG P VMEVPTLKP LSEDQARF YFQDL I KGIEYLHYQKI I 
H\RDI KPSNLLVGBDGHI KIADFGVSNEFKGSDALLSNTVGTPA 
FMAPESLSETRXIFSGKALDVWAMGVTLYCFVFG* CP FMDERIM 
CLHSKIKSQALEFPDQPDIAEDLKDLITRMLDKNPESR I WPE I 
K1HPWVTRHGA2PLPSEDENCTLVEVTEE EVENS VKHIPSIATV 
ILVKTMIRKRSFGNPFEGSRRBERSLSAPGNLLTKKPTRECESI* 

SELKT* KISPLPACCKVT* EFPH PSGCRPS CWQP PFLHTHSQPR 
*PEPPRTDEALCPYETGRTCWAPLLCVLWWVGTPLPFPLSTSWL 

PDLVGAPGSHFCFLN"IALLRYNSHTM 


5381


2 


2050 


PSRAGGAERGRAAAARS PGGSAAGWE CPS VLDEAGACTMSSCVS 
SQPSSNRAAPQDEIiGGRGSSSSESQKPCEALRGLSSLSlHIiGME 
SFIVVTECEPGCAVDI/3LARDPJLEADGQEVPIJ)TSGSQARPHL 
SGRKLSLQERSQGGLAAGGSLDMNGRCICPSLPYSPVSS PQSSP 
RLPRRPTVESHHVS ITGMQDCVQLNQ YTLKDEIGKGS YGWKLA 
YNENBNTYYAMKVIjS KKKLI RQAAFPRRP p PRGTRPAPGGC I QP 
RGP I \ EQVYQE IA\ II»KKLDHPNW\KIjVE VL\DDPNEDHLYMV 

F\EI,VNCX;PWEVPTLKPLSEI>QARFYFQDLIKGIEYLHYQKII
H\RDIKPSNLLVGEDGHIKIADFGVSNEFKGSDAI*LSNTVGTPA
FMAPESLSETRKIFSGKALDVWAMGVTLYCFVFG+CPFMDERIM
CIJISKIKSQALEFL'DQPDIAEDLJCDLITRMLDKNPESRIVVPEI
KLHPWVTRHGAEPLPSEDENCTLVEVTE^VENSVKHIPSLATV
ILVKTMIRKRSFGNPFEGSRREERSLSAPGNLLTKKPTRECESL
SELKT*KISPLPACCKVT*EFPHPSGCRPSCWQPPFLHTHSQPR
*PEPPRTOEALCPYETGRTC^PLLQVLWWVGTPLPFPLSTSWL

PDLVGAPGSHFCFXNIALLRYNSHTM


5382 


1536 


203 


GARGS QQDAP ALQ EAEVRGP ERAQ P ARGRMTKAR L FRLW LVLGS 
VFMILLIIVYP7DSAGAAHFYLHTSFSRPHTGPPLPTPGPDRDRE 
LTADSDVDEFLDKFLSAGVKQSDLPRKETEQPPAPGSMEESVRG 
YDWSPRDARRSPDQGRQQAERRSVLRGFCANSSLAFPTKERP FD 

DIPNSELSHLIVDDRHGAIYCYVPKVACTN^ 
RGAPYRDPLRIPREHVHNASAHLTFNKFWRRYGKLSRHLMKVKL 

KRYTKFLFVRDPFVRLISAFRSKFELENEEF/ * PQVRRAHAAAV 
RQPHQPARI/5ARGLPRWPQ\VSFANFIQYIXDPHTEKLAPFNEH 

WRQVTRLCHPCQIDYDFVGKLETI^EI)AAQIJ^LL 



5383 



45 






5250 



5384 



136 



5385 



326 



5386 



326 



886 



799 



799 



5387 



2117 






PELPGTGPPSSWEEDWFAKI PLAWRQQI/YKliYEADFVLFGVPKP
ENIiIiRD 



VERLLGCRNSKRTWRMLI S KNMP WRRL-QG ISFGMYSAEELKKLS 
VKS ITNPRYlJ>5IiGNPSANGLyDLALGPADSKEVCSTCVQDF£N 
CSGHLGHIELPLTVYNPLDFDKLYLLLRGSCLNCHMIiTCPRAVI 
HLL.LCQLRVLE VGAbQAV YELER I US RFLEENADPSAS E I REEL 
EQYTTEIVQNNL»LGSQGAHVKtrVCESKSK^ 

PHCKTGRSVVRKEHNSKLTITFPAMVHRTAGQKDSEPLGIEEAQ 
IGKRGyLTPTSAREHLSJVL»WKNEGFF 

PSVFFIJDPT,VVPPSP^RPVSRIX;DQMFTNGQTVNIiQAWKDVV^ 
IRKlJJUjMAQEQKLPEEVATPTTDEEKDSLIAIDRSFLSTLPGQ 
SLIDKLYN IW IRLQSH VN X VFDSEMDKLMMDKYPGIRQII*EKKE 
GLFRXHMMG KRVDYAARS VT CPDMY INTNE I G I PMVFATKLTYP 
QPVTPWNVQELRQAVING PNVHPGASMVINEDGS RTAL5 AVDMT 
QREAVAXQLLTPATGAPK^CXSTKIVC^HVKNGDIIJLiNRQPTLH 
RPSIQAHRARILPEEKVLRLHYANCKAYNADFIXSDEMNAHFPQS 
ELGRAEAYVLACTDQQYliV PKDGQ PLAGL I QDHMVSGAS MTTRG 
CFFTREHYMELVYRGLTDK^GTJVKLLSPSILKPFPLWTGKQVVS 
TLLINI I PEDHI PLNLSGKAKITGKAWVKETPRS VPGFNPDSMC 
ESQVI I REGE LLCGVLD KAHYGS S AYGLVH C CYE I YGGE TS G KV 
LTCXtARLFTAYlrQLYRG FTL3VED I LVKP KADVKRQR 1 1 EESTH 
CGPQAVRAALNliPEAASYDSVRGKWQDAHLGKDQRDFNMI DI>KF 
KEEVNH YSNE I NKAC1MP FGl*HRQFPENTLQIiMVQSGAKGSTVNT 
MQ I SCI »l jGQ I ELEGR S T PLMASGKS1*P CFEP YEFTPRAGGFVTG 
RFLTGIKPPEFFFHCMAGREGLVDTAVKTSRSGYXQRCI I KHLE 
GLWQYDLTVRDSDGSVVQFLYGEDGLDIPKTQFIiQPKQFPFTiA 
SNYBVIMKSQHLHEVljSRADPKKALHHFRAI KKWQS KHPNTLiUR 
RGAFLS YSQKI QEAVKALKLESENRNGR/RPWDS /G/RMI»RMWY 
ELDEES RRKYQKKAAACPDPSIiSVWRPDIY FAS VS ETFETKVDD 
YSQEWAAQTEKS YEKS EIiS1»DRLRTI*LQIi\KWQRS LCEPGEAVG 
liLAAQS IGEPSTOMTLNTFHFAGRGEMNVTLGI PRLREILMVAS 
ANIKTPMMSVPVLNTKKALKRVKSLKKQLTRVCLGEVLQKIDVQ 
ES FCMEEKQNKFQVYQI»R FQFLPHAY YQOEKCLRPEDI liRFMET 
RPFKLLMES IKKKNNKASAFRNVNTRRATQRDLDKAGELGRSRG 
EQEGDEEEEGHIVDAEAEEGDADASDAKRKEKQEEEVDYESEEE 
EEREGEENDDEDMQEERNPHREGARKTQEQDEEVGL /GH * GGPV 
PSRP PDAAPETHPQ PGAPGA\ EAMERRVQAVRE I HPFI DDYQYD 
TEESLWCQVTVKL PLMKINFDMSSLWSLAHGAVI YATKGI TRC 
LLNETTNNKNEKE1»VLNTEG XNIiPEL FTCYAEVLDLRRLYSND IH 
AI ANTYG I EAAIiRV I EKE I KDVFAVYG IAVDPRHI*S LVAD YMCF 
EGVYKPLNRFGIRSNSSPLQQMTFETSFQFLKQATMLGSHDBIiR 
S PSACLWGKWRGGTGLFEliKQPLR 



QSCGQRX,PTVL*L*GPPGSCPCI]^S1,F\PGRPHAI J PEIRPYINI 
TILKGDKGDPGPMGLPGYMGREGPQGBPGPQGSKGDKGEMGSPG 
APCQKRFFAFSVGRKTALHSGEDFQTLLFERVFVNLDGCFDMAT 
GQFAAPIiRG I YFFSLNVHS WWYKETYVHI MHNQKEAVI LYAQPS 
ERS IMQS QS VMLDIAYGDRVWVRLFKRQRENAI YSNDFDTYI TF 
SGHLIKAEDD 



LMVPRTIUCEAPAPPKAEAKAKA1j\KAKKAVLKDVKSHKKNKIHM 
SPTFRRPKTL*I^RQPKYPWKSTPRRNKLDHHVIIKFPLTTE-*A 
VXXIENNS IiliVFTVDVKANKHQI KQAVKK / LCD ID VA K VNTL I Q 
SDGERKAY VR LAPDYDALWATK2 GIT 



IjMVPRTKKEAPAPPKAEAKAKAL\KAKKAVIJG3VHSHK}CNKI 
S PTFRRP KTL* LRRQPKYPWKSTPRRNKLDHHVI I KFPLTTE *A 
VKKI ENNS LXi VFTVD VKANKHQ I KQAVKK/LCDIDVAKVNTLIQ 
SDGERXAYVRLAPDYDALWATKIGIT 



FWAASGGCWFVLGERRAGSLLSASYGTFAMPGMVLFGRRWAIA 
S DD L VF PG FFE L WRVLWWI G I LTL YLMHRG KLDCAGGALLS S Y 
LIVl^ILLAWICTVSAIMC^SMRGTIQfPGPRKSMSKLLYIRL 
ALFFPEMVWASI/3AAWVADGVQCDRTVVNG II ATVWSWI 1 1 AA 
TWS III VFDPLGGKMAPYSSAG PSHUDS HJDS S QLLNGLKTAAT 
SVWETRIKLLCCCIGKDDHTRVAPSSTAELFSTYFSDTDDVPSD 
TAJVniJVT.TjHOOODMIRNNOI^AOVVCHAPGSSOEADLDAEIjKNC 
HHYMQFAAAAYGWPLYIYRNPLTGLCRIGGDCCRSKNPQTMT/M 
VGGDQLQL/CTSAP ILHTHRAAVQGLHPRQLPWTRFTELPFLVA 
IxDHRKESVVVAVRGTMSUJDVLTDIjSAESE 

KGISQAARYVYQRLINDGILSQAFSIAPEYRLVIVGHSLGGGAA 
AUiATMVRAAYPQVR CYAFS PPRGLWS KALQE YS QSF I VS LVLG 
KDVIPPJ^SVTNLEDLKRRIIiRVVT^CNK^ 

CM PNWT.PTFT.DGGDORVLTO PLLGEOS LLTRWS PAYS FS S DS PL 
DS S PKYPPLY PPGR 1 1 HLQBEGASGR FGCCSAAH YS AKW S HEAE 
FSKILIGPKMLTDHMPDILMRALDSVVSDRAACVSCPAQGVSSV 

DVA 


5388 


1569 


753 



TADGGAGGGGRRQAG VRRHY LYP FTGG YRRRRAACQAER PAARS 
KDTDLAAYQKGNLG VQLRNMAQETNHS QVPMLCS TGCG FYGN PR 
TNGMCS VCYKEHT iQRQNSSNGR I S P PV QCTDGS VPEAQSALDST 

DTAQQPSEEQSKSLE\NRNKKRXAVSCAGRKWDLLGLNAGVEMF 
TVVYTVTOM YT IAL.T ITKQM L KNFVFQQE FKSFGS FHQQLLE YK 
ILEHLQTKN 


5389 


1569 


753 


TADGGAGGGGRRQAGVRRHYLVPFTGGYRRRRAACQAERPAARS

KDTDIJ^YQKGNIXSVQIjRNMAQETNHSQVPMLCSTGCGFYGNPR

TNGMCSVCYKEHLQRQNSSNGRISPPVQCTDGSVPSAQSALDST

SSSMQPSPVSNQSLLSESVASSQLDSTSVXIKAVPETEDVQASVS 

OTA(^PSEEQSKSLE\NRNKKRIAVSCAGRKWDLLGLiNAGVEMF 

*nnTV'nrpnMV*r T a T/T*TT*voMT.tntf TrvrP*fV''lRF , KS Ff5 S FHOOLLEYK 

ILEHLQTKN 


5390 


217 


1332 


EDPRKLMBDKMWSECEGPEMSLVCLTDFQAHAREQLSKSTRDFI 
EGGADDS ITRDDNI AAFKRI RLRPRYLRDVSEVDTRTTIQGEEI 
SAPICIAPTGPHCLVWPDGEMSTARAAQAA\GI CYITSTFASCS 
LEDlVIAAPEGLRNFQLYVHPDLQIiNKQLIQRVESLGFKALVIT 
LDTP VCGNRRHDIRNQLR RNLTLTDLQS PKKGNAI PYFQMTP I S 
T SLCWNDLSWFQS I TRLP 1 1 LKGILTKEDAELAVKHNVQG 1 1 VS 
NHGGRQLDEVLAS I DALTEWAAVKGKI EVYLDGGVRTGNDVLK 
ALALGAKCIFLGDAI LWALAS KGEHG VKEVLN I LTNEFHTSMA\ 

r.TYV 1 © C VZV V/TP*^ R I* 
ij A\jV_2vo V >vc X I s " rCl J l_i v ^ xr jiui 


5391 


1 


1292 


VKKAAGRSRGPPTAGGQRCEEAPGTV>3ERRLGVRAWVKENRGSF 

QPPVCNKMIQEQLKVMFVGGPOTRKDYHIEEGEEVFYQLEGDM 

VLRVLEC^KHRDWIRQGEIFliLPARVPHSPQRFANTVGLVVER 

RRLETELDGIiRYYVGDTMDVLFElKWFYCKDIX3TQIA 

SEQYRTGKPIPDQI^KEPPFPIiSTRSIMEPMSLDAWLDSHHREL 

QAGT P IS LFGDTYETQ V IA YGQGS SEGLRQNVD W7LWQLEG S SV 

Vm^RPO*SU3PWMDSIXVI^WGPSY\AW\ERT<3GSVALSVT\Q 

D P ACKKS PWGEP S CHGLKAATG VP STLEVP SLPNNS P S PHYLSV 

Y CRCVPHRPAHCCH PPS CPSQPRCHAPGRAAAPHZiL WQTQPTAL 

PVLPGGLPPAPLLPIPLSLQTQCSTSTPRRPSIKAS 


5392 


1 


1623 



I RGSNAQKWGASGSGGAG PQPDP AGPGG VPALAAAVLGACE PR 
CAAPC PLPALSRCRGAGSRGSRGGRGAAG SGDAAAAAEW I RKGS 
PTHKPAHGWIiHPDARVl^PGVSYWRYMGCIEVLRSMRSIJ^FlTr 
RTQVTREAINRLHEAVPGVRGS WKKKAPNKALAS VLGKSNLRFA 
GMSISIHISTTCI^LSVPATRQVIANHHMPSISFASGGDTDMTD 
YVAYVAKDPINQRACHI LECCEGL\AQS 1 1 STVGQAFELRFKQY 
LHSPPKVALPPERLAGPEESAWGDEEDSLEHNYYNSIPGKEPPL 
GGLVDSRLALTQPCALTALDQG PS PSLRDACSLP WDVGSTGTAP 
PGDGYVQADARGPPDHEEHLYVNTQGIJ3APEPEDSPKKDLFDMR 

P FEEIALKLHECS VAAGVTAAPLPLEDQW PS PPTRRAPVAPTEEQ 
I*RQEPWYHGRMSRRAAERMLRADGDFT*VRDS\n^PGQYVLTGMH 
AGQPKHIilJjVDPEGVVRTKDVLFESISHLIDHHLQNGQPIVAAE 

SELHLRGWSREP 


5393 


2 


982 


GGDSAGMTMETC^QNVCTRMLWLLQPLTVLliL^^ 
PKAVLKLEPPWINV1>Q\ EDSVTLTCQGAPQ P / ERSDS I QWFHNG 
\NLI PTHTQPS \YRPKAN>m\IKGEYTCQTGQTSL\SDPVHLTV 
LSEWLVLQTPHLEFQEGETIMLRCHS \WRDKP\LVKVTFFQNGK 
SQKFSHLiDPTFSIPQANHSHSGDYHC-l-GNIGYTLFSSKPVTITV 
QVPSMGSSSPMGI IVAWIATAVAAIVMWAIil YCRKKRISAN 
STDPVKAAQFEPPGRQMIAI RKRQLEETNNDYETADGGYOTLNP 
RAPTD0DKNIYLTLPPNDHVNSNN 


5394


2 


982


GGDSAGMTMETQMSQNVCPRNLWLLQPLTVLLIiASADSQAAAP 
PKAVLKL£PPWI1^VL0\EDSVTLTCQGAPQP/ERSDSIQWFH]SJG 
\NIiIPTHTQPS\YRFKANNN\DSGEYTCQTGQTSL\SDPVHIiTV 
LSEWLVLCrTPHLEFX3EGETIMIiRCHS\WRDKP\LVKVTFFQNGK 
SQKFSHLDPTFS I PQANHSHSGDYHCTGN I G YTL FSS KPVTI TV 
QVPSMGSSSPMGI I VAWIATAVAAI VAAWALI YCRKKR I SAN 
STDPVKAAQFEPPGRQMIAI RKRQLEETNNDYETADGGYMTLNP 
RAPTDDDKNIYLTLPPNDHVNSNN 


5395 


3135 


531


RAS D AKNQEGLLNTRRKS TDS VP I S KSTLS R S LS IjQAS D FDG AS 
S SGNPEAVALAPDAYSTGSS SASSTLKRTKKPRP PSLKKKQTTK 
KPTETPP VKETQQE PDEESLVPSGENLAS ETKTESAKTEG PS PA 
LLEET PbEPAAG PKAACPLDS ESVEG W PP ASGGGRVQNS PP VG 
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS 
SWDNQQENPPPTKKIGKKPVAKMPIJIRPKMKKTPEKLDNTPASP 
PRS PAEPNDI PIAJCGTYTFD I DKWDDPNFNP FSSTSKMQ ES P KI> 
PQQS YNFDPDTCDES VDP FKTSS KTPSS PS KS PAS FEI PASAME 
ANGVDGDGI^KPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP 
TPAATPETPPVISAVVHATDEEKLAVTNQKWTCMTVDLEADKQD 
YPQPSDLSTFVNETKFSS PTEELDYRNS YBIEYMEKIGSSLPQD 
DDAPKKQALYLMFDTSQES P VKSS P VRMSES PTPCS GS S FEE TE 
ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP 
SEAIEI TAPEGS FASADALLS RLAHPVS LCGALD YLEPDLAEKN 
PPLFAQKLQREAAHPTDVS ISKTALYSRIGTAEVEKPAGLIjFQQ 
PDLDSALQIARAE I ITKEREVSEWKDKYEESRREVMEMRKIVAE 
YEKT XAQM I EDEQREKS VS \HQTVQQLVX>EKEQA\ IjADIjNS VE K 
XSLADLFRRYEKMKEVI^GFRKNEEVLKRCAQEYIiSRVKKEEQR 
YQAI*KVHA\EEKliDRANAE\ IAQVRGKAQQEQAAHQASLiAERSS 
CRV\DALERTLEQKNKEIEELTKICDELIAKMGKS 


5396 


3135 


531 


RASDAKNQEGLLNTRRKS TDS VPISKS TLSRSLSLQASDFDGAS 
S SGNPEAVAIiAPDAYSTGS S S ASSTLKRTKKPRPPSLKKKQTTK 
KPTETPPVKETOQEPDEESLVPSGENIiASETKTESAKTEGPS PA 
LLEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG 
RKTLPLTTAPEAGEVTPSDSGGQEDS PAKGHSVRLEFD YSEDKS 
SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPAS? 
PRS PAE PND I P I ARGTYTFDI DKWDDPNFNPFSS TS KMQES PKL 
PQQSYNFDPE3TCDESVDPFKTSS KTPSS PS KSPASFE I PASAME 
ANGVDGDG LNK PAKKKKTP LKTDTFRVKKS P KRS PLS DP PS Q D ? 
TPAATPETPPV I S AWHATDEE KLAVTNQKWTCMTVDLEADKQD 
YPQPSDLS TFVNETKFSS PTEELDYRNS YEI EYMEK IGSSLP QD
DDAPKKQALYLMFDTSQES PVKSSPVRMSESPTPCSGSSFEETE 
ALVNTAAKNQHPVPRGLAPNQESHLQVPE K5SQKELEAMGLGTP 
S EAIE I TAPEGS FAS AD ALLS RLAH PVSLCGALD YLEPDLAE KM 
PPIJAQKUJREAAHPTDVSISKTALYSR IGTAEVEKPAGLLFQQ 
PDLDSALQIARAE I ITKEREVSEWKDKYEESRREVMEMRKIVAE 
YEKTIAQMI EDEQREKSVS\HQTVQQLVLEKEQA\LADLNSVEK 
\SLADLFRRYEKMKEVLEGFRKNEEVLKRCAQEYLSRVKKEEQR 
YQALKVHA\EEKLDRANAE \ I AQ VRGKAQQSQ AAHQAS LAER S S 
CRV\DALERTLEQKNKEIEELTKI CDELIAKMGKS 


5397 


3135


531 


RASDAKNQEGLIiNTRRJKSTDSVPISKSTLSRSLSIjQASDFDGAS 
SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK 
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA 
LLEET PLEPAAG PKAAC PLDSES VEGW PPASGGGRVQNS PP VG 
RKTLPLTTAPEAGEVTPSDSGGQEDS PAKGHS VRLBFDYSEDKS 
SWDNQQENPPPTKKIGIOCPVAKMPDRRPKMKKTPEKLDNTPASP 
PRS PAEPNDI P IAKGTYTFDIDKWDDPKFNPFSSTS KMQESP KL 
PQQSYNFDPDTCDESVDPFKTSSKTPSSPSKSPAS FBI PASAME 
ANGVDGTCI^KPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP 
TPAATPETPPVI SAVVHATDEEKIAVTNQKWTCMTVDLiEADKQD 
YPQPSDLS TFVNETKFSS PTEELDYRNSYE IEYMEKXGSS LPQD 
DDAPKKQADYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
AL.VNTAAKNQH PVPRG LAPNQESHLQ VPE KSSQ KELEAMGLGT P 
SEAI B I TAP EG S FASADAJLLS RI^PVSL.CGALDYI.EPDLAEKN 
PPliFAQKLQREAAHPTDVS I S KT ALY S R I G T AE VE KP AGI»L F QQ 
PDLDSAL.Q I ARAE I ITKEREVSEWKDKYEESRREVMEMRKIVAE 
YEKT IAQMI EDEQREKSVS \HQTVQQLVLEKEOA\LADLNSVEK 
\SLADLFRRY^KMKEVL^FRKNEEVIiKRCAQEYLSRVKKEEQR 
YQALKVHA\ EEKLDRANAE \ IAQVRGKAQQEQAAHQASLAERS S 
CRV\DALERTIiEQKNXEIEELTKICDEl>IAKMGKS 


5398 


56 


5426 



SGBVCRMESNFNQEG VPRPS YVFSAD P I ARPSE I N FDG I KLDLS 
HEFSI»VAPNTEANS FES KD YJjQVCLR I RPFTQS EKEIjES EG CVH 
ILDSQTVVLKEPQCILGRLSEKSSG\QM\AQKFSFFPGFLGPAT 
TQKEFFG^CIMHP\VXDLIiKGQSRI»IFTYGLTNSGKTYTFQGTB 
ENIRILPRTLNVLFDSLQERLYTKMNLKPHRSRBYLRI^SEQEK 

EE I AS KS AKLRQ I KE VTVHND S DDTL YGS LTNSLN I SEFEES I K 
DYEQANliNMANS I KFSVWVSFFEI YNBYI YDLFVPVSSKFQKRK 
MLRLSQDVKGYSFIKDLQWIQVSDS KEAYRLLKLGI KHQSVAFT 
KIJ^NASSRSHSIPTVKILQIEDSEMSRVIRVSEI»SLCDLAGSER 
TMKTX3NEGERLRETGNINTSLLTLGKCINVLKNSEKS KFQQHVP 
FRESKLTHYF/QSFFNGKGKICMIVNISQCYLAYBETLNVLKFS 
AIAQKVCVPDTIiNSSQEKLFGPVKSSQDVSIiDSNSNSKIlJTVKR 
ATISWENSLEDIiMEDEDLVEELENAEETED/VGETiCLIiDEDLDK 
TLEENKAFI SHEERRKLLDLIEDLKKKIiINEKKEKLTliEFKI RE 
EVTQEFTQYWAQREADFKETIiLQEREI I>EENAERR1jAI FKDI/V/G 
KOm^EAAKDICATKVETEEATACLELKFNQIKAEIAKTKGEL 
IKTKEELICKRENESDSIiIQELETSNKKl ITQNQRIKELINIIDQ 
KEDTINEFQNLKSHMENTFKC^KADTSSLIINNKLICNETVEV 
P KDS KS KI CS 3RKR VNENELQQDE P PAKKGS I HVS S AI TEDQKK 
SEEVRPNI AE I EDIRVLQENNEGLRAFLI»TIENEIiKNEKEEKAE 
LNKQIVHFQQSLSliSEKKNLTLS KEVQQIQSNYDlAIAELiHVQK 
SKNQEQEEKIMKLSNEIETATRS ITNNVSQI KLMHTKI DELRTL 
DSVSQISNIDIJ^NLRDLSNGSECTNL.PNTQLDIjIjGNDYLVSKQV 
KEYRIQEPNRENSFHSSIEAIWEECKEIVKASSKKSHQIEELEQ 
Q I EKIjO^EVKGYiaDENTSnU*KEKEHKNQDDIjLKE KETL I QQLKEE 
LQEXNVTLDVQIQHWEGKRALS ELTQGVTCYKAK I KE I*ETII*E 
TQKVERSHSAKLEQDI LEKESI I L.KLERN1.KE FQEHLQDS VKNT 
KDLNVKELKIiKEEITQLTNOTiQDMKHLLQLKEEEEETNRQETEK 
LKEEIiS ASS ARTQN\LNADLQRKEED YADL KE KLTDAKKQI KQV 
QKEVSVMRDEDKLLRI KI NELE KKKNQC S Q ELDMKQR\ T I QQIiK 
EQI*INQKVEEAI QQYERACKDLNVKEKI I EDMRMTLEEQEQTOV 
EQDQVlA EAJObEEVERLATEIiDRWR VKCNDLE TKNNQRSNKEHE 
NNTDVLG KLTULQDELQESEQKYNADRKKWLE E KMML ITQAKEA 
ENIRNKEMKKYAEDRERFFKQQNEMEI LTAQLTEKDSDLQKWRE 
ERIX}I*VAAIiE IQLKAIiI SSNVQKDNE I EQLKRI I SETS KI ETQ I 
MDIKPKR I SSADPDKLQTEPLSTS FEISRNKI EDGSWLDS CEV 
STENDQSTRFPKPEI»BIQFTPIiQPNKMAVKHPGCrTPVTVKIPK 
ARKRKSNEMBEDLVKCENK^NATPRT^TLKFPISDDRNSSVTCKEQ 
KVAIRPSSKKTYSl^SQASIIGVNLATKKKEGTLQKFGDFLQHS 
PSILQSKAKXI IETMSSSKI^NVEASKEm^SQPKRAKRKLYTSE 
ISSPIDISGQVILMDQKMKESDHQI I KRRLR TKTAX 


5399 


705 


230 


" GPRMAKFIiSQDQ INRYKECFSLYDKQQRGKIKATDLflVAMRCI/3 
AS PTPGEVQRKLQTHG I DGNGELDFSTFLT IMHMQIKQEDPKKE 
ILLAMLMVDKEKXGYVMASDLRS KLTS LGEKLTHKEV\DDLFRE 
\ADIEPNGKVKYDEFIHKITSYLDGTY 


5400 


931 


248 


SHCSSGMEI PPTNYPASRAALVAQNY INYQQGTPHRVFEVQ-CVK 
QASMEDI PGRGHXYRI*KFAVEEI I QKQVKVNCTA2 VLYPS TGQE 
TAPEVNFTFEGETGKNPDEEDNTFYQRLKSMKE PLEAQN I \P0N 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








FGNVSPEMTLVLHLAWVACG Y 1 1 WQN S T EDTWY KMVKI QTV KQV 
QRNDDFI ELDYT I LLHNIASQE I IPWQMQVLWHPQYGTKVKHNS 
RLPKEVQLE 


5401 


3 


1360 


TG WS YG PTTS LAFLAPRDF P F P PKLLI H PQAWRLS CGAGS MGS 

QAAAEWRNWASWEGSSSI^GCSMG^TFKDDRIVFWTWMFSTYFME . 

KWAPRQDDMLFYVRRKI^YSGSESGATCRKAAEPEVEVWYRRD 

SKKI^IXSDPDIDWEESVCIjNLIIiQKI.DYMVTCAVCTRADGGDI 

HIHKKKSQQVFASPSKHPMDSKGEESKISYPNIFFMIDSF\BE\ 

V FS DMTVG KG EMVCVEL VAS D KTNTFQG V I FQGS I R YEALKKVY 

DNRVSVAARMAQK\MSFGFSKYSNMEF\VR\MKGPQGKGHAEMA 

VSRVSTGDTS PCGTEEDSSPAS PMHERVTS FSTP PTP E RNNRP A 

FFSPSLKRKVPRNRIAEMKXSHSANDSEEFFREDDGGADLHKAT 

NLRS RS LS GTGRSLVGS WLKLNRADGN FLL YAHLTYVTL PLHRI 

LTDILEVRQKPILMT 


5402 


3445 


1563 


GECFIMAAWQQNDLVFEFASNVMEDERQLGDPAI FPAVI VEHV 
PGADILNS YAGLACVEEPNDMITESSLDVAEEE I IDDDDDDITL 
TVEASCHDGDETIETI EAAEALLNMDSPGPMLDEKRINNNIFSS 
PBDDMWAPVTHVS VTLDGI PEVMETQQVQEKYADS PGASS PEQ 
PKRKKGRKTKPPRPDSPATTPNlSVKBOaTKDGKGNTIYLWEFLL . 
ALLQDKATCP KYI KWTQRE KG I FKLVDS KP VSRLWR KHKNKP \ D 
MKYE PMGRALRYYYQRG I LAKVEGQRLVYQFKEMPKDL1 Y I ND E 
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK 
PGNSKAAKPKDPVEVAQPSEVLRTVQPTQSPYPTQLFRTVHVVQ 
P VQAVPEGEAARTSTMQDETLNSS VQS IR \ T I QAPTQV PVWS P 
RNQQ\I^TVTLQTVPLTTVIASTDPSAGTGSQKFILQAI PSSQP 
MTVLKENVMLQSQKAGS PP SXVLG PARV \QQVLTSNVQT I CNGT 
VSV\ASSPSFS\ATAPWTLFLL<3SSQIjVAHPPGTVITSVIKXQ 
ETKTLTQEVEKKESEDHLKENTEKTEO^PQPYVMVVSSSNGFTS 
QVAMKQNELLEPNSF 


5403 


3445 


1563 



GECFI MAAWQQNDLVFEFASNVMEDERQLGDPAI FPAVI VEHV 
PGADI LNSYAGLACVEEPNDMITES SLDVAEEEI IDDDDDD I XL 
TVEASCHDGDETIETIBAAEALLNMDSPGPMLDEKRINNNI FSS 
PEDDM WAPVTHVSVTLDG I PEVMETQQ VQEKYADSPGAS S PEQ 
PKRKKGRKTKPPRPDSPATTPNISVKKKNKDGKGNTIYLWEFLL 
ALLQDKATCP KY I KWTQRE KG I FKLVDS KPVSRLWRKHKNKP\D 
MNYEPMGRALRYYYQRG I LAKVEGQRLVYQ FKEMPKDLI YINDE 
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK 
PGNSKAAKPKDPVEVAQPSEVLRTVQPTQSPYPTQLFRTVHVVQ 
PVQAVPEGEAARTS TMQDETLNSSVQS I R\TIQAPTQVPVWS P 
RNQQXlJmm^QTVPLTTVIASTDPSAGTGSQKFILQAI PS SQP 
MTVLKENVMLQSQKAGSPPSIVIX3PARV\QQVLTSNVQTICa»rGT 
VSV\ASS PSFS \ATAP WTLFLLGSSQLVAHPPGTVI TSVI KTQ 
ETKTLTQEVEKKESEDHLKENTEKTEQQPQPYVMVVSSSNGFTS 
QVAMKQNELLEPNSF 


5404 


187 


1111 


LPVTLI FAKMKTLQSTLLLLLLVPL I KPAPPTQQDSRI I YDYGT 
DNFEESI FSQDYEDKYLDGKNI KBKETVI IPNEKSLQLQKDEAI 
TPLPPKKENDEMPTCLLCVCLSGSVYCEEVDIDAVPPLPKESAY 
LYARFNKIKKLT\AKDFADIPNLRRLDFTCNLIEDIEDGTFSKL 
SLVEELSIAENQLLKLPVLPPKLTLFNAJCYNKIKSRGIKANAFK 
KLNNLTFLYLDHNALESVPLNLPESLRVIHLQFNNIASITDDTF 
CKANDTS YIRDR1 EEIRLEGNP I VLGKHPNSFICLKRLPIGS YF 


5405 


2199 


1220 


QNSRSLHMDPQNQHGSGSSLWrQQPSLDSRPRLDYEREIQPTA 
II*SLDQIKAIRGSNEYTEGPSVVKRPA^RTAPRQEKHERTHEII 
PINVNNNYEHRHTSHLGHAVLPSNARGPILSRSTSTGSAASSGS 
NSSASSEQGLLGRSPPTRPVPGHRSERAIRTQPKQLIVDDLKGS 
LKEDLTQHKF I CEQCGKCKCGECTAPRTLPS CLACNRQ CLCSAE 
SMVEYGTCMCL\VKG I F YHCSNDDEGDS YSDNPCSCSQSHCCSR 
YLCMGAMSLFLPCLLCYPPAKGCLKLCRRCYDWIHRPGCRCKNS 
NTVYCKLESCPSRGQGKPS 


5406


279 


2732 


RWRTYNVEGPLTFMDVAI EFCLEE WQCLDTAQQNLYRNVMLENY 








RNLVFLG / 1 IAVSKPDLiITCLEQEKEPWEPMRRHEMVAKPPVMC 
SHFTQDFWPEQHIKDPFQKATLRRYKNCEHKNVHLKKDHKSVDE 
CXVHRGG YNGFNQCIiP ATQS K I FLFDKCVKAFHKFSNSNRHKI S 
HTEXKLFKCKE CGKS FCMLSHLAQHKI IHTRVNFCKCEXCGKAF 
NCPSI ITKHlOlINTGEKPY^rCEECGKVFNWSSRljTTHKKrm'RY 
KLYXCEECGKAFNKSS I LTTHKI I RTG EKFYKCKECAKAFNQS S 
NI.TEHKKIHPGE FCPYKCEECGKAFW^PSTLTKHIGilHTGEKPYT 
CEEGGKAF!^OFSNIjTTHKRIHTA\EKFyKCTECGEAPSRS\SNI» 
TKHKE IHTEKKPYKCEECGKAFKWSSKLTEHKLTHTGEKPYKCE 
KCGKAFl^CPSI ITKHNRINTGEKPYTCEECGKVFNWSSR1.TTHK 
KNYTRYKLYKCEECGKAFNKSSILTTHKKIHI EKXFYKCEECGK 
AFKWS SKLTEHK I THTGEKPYXCEECG KAFNHFS I LTKHKR I HT 
GEKPYKCEECGKAFTQSSNI»TTHKKIH'lGEKFYKCEECGKAFrQ 

SSNLTTHKKIHTGGKPYKCEECGKAFNOFSTLTXHKI IHTEEKP 
YKCEE CGKAFKNSSTLTKHKI IHTGEXPYKCEECG\KAFKLSST 
LSTHKI IHTGEKPYXCEKCGKAFNRPSNLIEHKKIHTGEQPYKC 
EECXJKAFNYSSHIJmiKRIHTKEQPYKCKECGKAFNQYSNLTTH 
NKIHTGEKL YKPEDVTVI LTTPQTFSNI K 


5407 


3 


659 


RPRRRQSSCCTGWIiAGWIiLiRAAPRFCRRTETDMEQGKGLAVL, I L 

AIILLQGTLAQS IKGNHLVKVYDYQEDGS VLLTCDAEAKNITWF 
KIXSKMIGFLTEDKKKWNLGSNAKDPRGMYQCKGSQNKSKPI^ 
YRMCQNCIELNAATI SGFLFAEIVS I FDLAVGVYF I AGTGMEFR 
QS\IlASDKOTIXP\NDPAPTQPLKDPPJC^4TQYSHLOGN\Q^^RRN 


5408 


2745 


6128 


QGSKGTCHPQAQQPWDEGVWQEAPSQSEPWGQSQEPPTMPQRI*P 
HARQHTPIiPIGSADYRJRWSVRPOGPHRDPKDSRDAAKREQGS L 
APRP VP AS RGG KTLCKG YRQAPPG P PAQ FQR P I CSAS PPWASRF 
STPCPGGAVREDTYPVGTGGVPSLAIiAOGGPQGSWRFLEWKSMP 
RLPTDL.DIGGPWFPHYDFERSCWVRAISQEDQLATCWQAEHCGE 
VRNKDMSWPEEMS FIANSS KIDRHKVPTEKGATGLSMLGNTCFM 
NSSIQ CVSNTQ PLTQYFI S GRHL YELNRTOP IGMKGHMAKCYGD 
LVQELWSGTQKNVAPIiKLRWTIAKYAPRFNGFQQQDSQELLAFL 
LIX3LHEDLm?VHEKPYVELKDSDGRPDWEVAAEAV7Dmu^RRNRS 
rWDLFHGQLRSQVKCKTCGHISVRFDPFNFIiSLPliPMDSYMHI* 
EITVI KIjDGTTPVRYGIjRLNMDEKYTGIjKKQIiSDLCGLNSEQI l 
LAEVHGSNIKNFPQDNQKVRLS VSGFIiCAFEI P VPVSPISAS S P 
TQTDFSSSPSTNEMFTI»TTNGDLPRPIFI PNGMPN7WPCGTEX 
NFTNGMVNGHMPSLPDSPFTGYI IAVHR KMMRTE L> YFLSSQ KN R 
PSIJGMPLIVPCTVHTRKKDLYDAVMI QVSRIAS PLPPQEASNH 
AODCDDSMGYGYPFTLRVVQKDGIfSCAWCPWYRFCRGCKIDCJGE 
DRAFIGNAYIAVDWHPTALHliRYQTSQERVVDEHESVEQSRRAQ 
VE P INLDS CXiRAFTS EEELGENEM YYCS KCKTHCLATKKLDIiVTR 
LPPILI IHIiKRFQFVKGRW IK3QKIVKFPRES FDPSAFLVPRDP 
ALCX2HKPliTPCGDEI»SEPRirJu^EVlCKVI)AQSSAGEEDVLLSKS 
PSSLSANIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTIGRS 
KGRLRLPQ IGSKNK1»S S S KENLDAS KENGAGQ I CEIADALSRGH 
VLGGS QPELVTPQDHEVALANGFIjYEBEACGNGCGNG YSNGQLG 
NHSEEDSTDDQREDTRIKPIYNLYAISCHSGILGGGHYVTYAKN 
PNCKWYCYNDSSCKELHPDEIDTDSAYILFYEQQGIDYAQFLPK 
TDGKKMADTSSMDBDFESDY\EKYCVIjQ 


5409 


2745 


6128 


QGSKGTCHPQAQQPWDEGVWQEAPSQSBPWGQSQEPPTMPQRIiP 
HARQHTPLPLGSADYRRWSVRPGGPHRDPKDSRDAAJCRECGSL 
APRPVPASRGGKTI>CKGYRQAPPGPPAQFQRP I CSAS PPWASRF 
STPCPGGAVREDTYTVGTQGVPSLAIAOGGPQGSWRFLEWKSMP 

RLPTDLD I GG P WFPH YDFERS CWVRAI S Q ED QLATCW QAEHCG E 
VRNKDMSWPEEMSFIANSSKIDRHKVPTEKGATGLSNLGNTCF>1 

NSS IQCVSNTQPLTQYFI SGRHLYELNRTNP IGMKGHMAKCYGD 
I»VQEL»WSGTQKNVAPliKLRWT IAKYAPRFNGFQQQDSQELLAFI* 
LDGLHEDLNRVHEKPYVELKDSIX5RPDWEVAA 
IVVDLFHGQLRSQVKCKTCX3HISVRFDPFNFI*SLPLPMDSYMHIj 

E ITVTKLDGTTPVR YGLRIiNMDEKYTGLKKQLSDIiCGLNSEQII* 
l^AEVHGSNIKNFPQDNQKVRIiSVSGFLCAFBIPVPVSPISASSP 






TQTDFSSSPSTNEMFXLTTNGDLPRP I FI PNGMPNTWPCGTEK 
NFTNGMVNGHMPSLPDSPFTGYI IAVHRKMMRTELYFLSSQKNR 
PSLFGMPLIVPCTVHTRKKDLYDAVW IQVSRLAS PLPPQEASNH 
AQDCDDSMGYQYP FTLRWQKDGNS CAWCPWYRFCRGCKIDCGE 
DRAF I GNAY I AVD WHPTALHbR YQTSQ ER Wt>EHE S VEQS RRAQ 
VEPI NLDS CLRAFTS EEELGENEM YYCS KCKTHCLATKKLDIjWR 
LPPI LI IHLKRFQFVNGRWI KSQKI VKFPRESFDPSAFLVPRDP 
ALCQHKPLTPQGDELSEPRI LARBVKKVDAQSS AGEEDVL LS KS 
PSSLSANIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTLGRS 
KGRLRLPQ IGS KNKLSS S KENIjDAS KENGAGQ I CE LADALS RGH 
VLGGSQPELVTPQDHEVALANGFLYEHEACGNGCGNGYSNGQLG 
NHSEEDSTDDQREDTRI KPI YNLYAISCHSGILGGGHYVTYAKN 
PNCKWYC YNDS S CKELH P DE I DTDSAY I LFYEQQG I DYAQFLPK 
TDGKKMADTSSMDEDFESDY\EKYCVLQ 


5410 


2 


710 


LRFPGQARHVWLAARMQAPHf03iLYKIjIiVIGDLGVGKTS 1 1 KRY 
VHQNFSSHYRATIGVDFALKVLHWDPETVVRL01,WDIAGQERFG 
N>n^VYYREAMGAP I VFDVTRP ATFEAVAKWKNDLDSKLS L PN3 
KPVSVVIjIjANKCDQGKITVnJMlWGI, 

ENIN I DEASRCLVKHI LANECDLMES I EPDWKPH1»TSTKVASC 
SG\CAKI LVGTFAGVW 


5411



1302 


289 


TGPAAAGRRKALGS FGKPS P VTGLRAAR RRRTR PS APAAPS VGC 
GKRRESDAGAGGERASVRTGSGRRGGRTMAGDS EQTLQNHQQPN 
GGEF FL IGVSGGTASGKS S VCAK I VQ LLGQNEVDYRQKQ WI LS 
QDS FYR VL TSEQKAKAL KG Q FNFDHPDA FDNEL ILKTLKEITEG 
KTVQIPVYDFVSHSRKEETVTVYPADVVLFEGIIjAFYSQBR/IR 
DLFQMKLFVDTDADTRIiSRRVLKDlSERGRDLEQILSSSTLRFV 
KPA\ FEEFCLPPK\KYADVI I PR\GADN\RVPINL I VQHIQ\D I 
LNGGPSXNRQTNGCUTGYTPSRKRQASESSSRPH 


5412 


3180


313 


QGISNFFHKEANFWFEVSG YI* ISPLRSPFVDPAI>EWSLMAS pwn 
KMEGESSRPEIHTPVSDKKKKKCSIHKERPQKHSHE I FRDSSLV 
NEQSQITRRKKRKKDFQHIilSSPLKKSRICDETANATSTLKKRK 
KRRY S ALEVDE EAGVTVVL VDKE N INNT P KHFR KD VD W CVDM S 
I EQKL PR K \ PKTDKFQ VLAKS H \AHKS EALHS KVRE KKNKKHQ R. 
KAASWESQRA\RDTLPQSEFPTQEESWLSVGPGGEITELP\ASA 
HKNKS KKKKKKSSNRE YET\IiAMPEGSQAGREAGTDMQESQPTV 
GI^DETPQLICPTHIOCKSKKXKKKKSNHQEFESLAMPEGSQVGS 
EVGADMQES \RPAVGLHGETAGI PAPAYKNKSKKKKKKSNHQEF 
EAVAMPES LESAYPEGSQVGS E VGTVEGS TALKG FKESNSTKKK 
SKKRXI/TS VKRARVS GDDFS VPSKNSESTIiFDSVEGDGAMMEEG 
VKS RPRQKKTQACLAS KHVQEAPRLE PANEEHNVETAEDSE I RY 
LSADSGDADDSDADLGSAVKQLQEFIPNI KDRATSTI KRMYRDD 
LERFKEFKACK^AIKFGKFSVKEWKQLEKNVEDFIALTGIESAD 
KLLYTDRYPEEKS VI TNLKRRYSFRLHIG \RNI ARPWKLI YYRA 
IOCMFDVNNYKGRYSEGDTEKLKMYHSLLGNDWKTIGEMVARRSL 
S VALKFSQ ISSQRNRGAWS KS E TRKL I KAVEEVI LKKMS PQELK 
EVDSKLQEKTPESCLS I VREKLY KG I S W VE VE AKVQTRNWMQC KS 
KWTEILTKRMTNGRR X Y YGMNAL RAKVS L I ERLYE INVEDTNE I 
DWEDLAS AI GDVP PS YVQTKFS RI*KAVYVP FWQKKTF P E I IDYL 
YETTI^LLKEKLEKMMEKKGTKIQTPAAPKQVFPFRDIFYYEDD 
SEGGGHRKRKRRPRRHAWFTPVIPVLWEAKAGWII 


5413 



3753


1304 


RFPAGVAPRRAMANVS KKVS WSGRDRDDEEAAPIjLRRTARPGGG 
TP]^NGAGPGAARQSPRSALFRVGHMSSVKI>DDELLEP\DMDPP 
HPFPKEIPHNEKLLSIiKYESLDYDNSEKQLFLEEERRINHTAFR 
TVEIKRWVICALIGILTGLVACFIDIVVEIILAGLKYRVIKGNID 
KFTEKGGLS FSLLLWATLNAAF VLVGSVIVAF I EPVAAGSG I PQ 
I KCFLNGVKI PHWRLKTLV I KVSGVT I*S WGGLAVGKEG PM I H 
SGSVIAAGISQGRSTSLKRDFKI FEYI*RRDTEKRDFVSAGAAAG 
VSAAFGAPVGGVLFSLEEGAS FWNQFLTWRI FFASMI STFTLN F 
VLS I YHGNMWDLSS PGLINFGRFDSEKMAYTI HEIPVFIAMGW 
GG VLG AVFNALNYWLTMFRIR Y I HRPCX.QVI EAVLVAAVTATVA 
FViaYSSRDC^PLQTCSMSYPLQLFCADGEYNSMAAAFFNTPEK 








SWSLFHDPPGSYNPLTL.GLFTLVYFFLACWTYGLTVSAGVFI p 
SLL I GAAWGRL FGISLSY LTGAAI WADPG KYALMG AAAQLGG I V 
RMTLS LTVI MMEATS NVTYGFP I ML VLMT AKI VGDVF I EG L YDM 
HIQLQSVPFI»HWEAPVTSHSI*TAREVMSTPVTCXiRRREKVGVI V 
DVLS DT ASNHNG FP VVE HADDTQ P ARLQGL I LRS QL I V LL KHKV 
FVERSNLGLVQRRLRLKDFRDAYPRFPP IQS IHVSQDERECTKD 
LSE FMNPS P YTVPQEAS L P RVPKX.FRALGLRHLWVDNRNQ WG 
LVTRKDLARYRLGKRGLEELSIiAOr 


5414 


2130 


390 


GVASAWDRALFS PLLS PTSRVFRTSPPRCVSTETGRRDRARVPS 
QWCSVLQGKLPVSGRTSLACVRS I LLSPASSPRKVG 1 VGGTGAR 
AGAAPRDHGRVRHRRPSSARRWTRTTGQCLtAPRGCQGPRGTRSP 
RSPRSRTRRGCSASPACLiP /CRSALIVAVLCYINr iLtNYMDRFTV 
AGVLPD I EQFFN I GD S S S G LI QTVF I SS YMVLAPVFG YLGDR YN 
RKYLMCGGIAFWSLVTLGSSFI PGEHFWLLLLTRGLVGVGEASY 
STIAPTLIADLFVADQRSRMLSIFYFAIPVGSGLGYIAGSKVKD 
MAGDWHWALRVTPGLGVVAVLLLFLVVREPPRGAVERHSDLPPL 
NPTSWWADLRALARNPS FVXiSSLGFTAVAFVTGSLAIiWAPAFLL 
RSRWLGETPPCLPGDS CSSSDSLI FGLITCLTGVLGVGLGVEI 
SRRLRHSNPRAD PLVCATGLLGS AP FL FLSLACARGS 1 VAT Y I F 
IFIGETLIiSMNWAIVADI LLYWI PTRRSTAEAFQ I VLSHLLGD 
AGSPYLIGLISDRLRRIWPPSFI^EFRALQFSI^LCAFVGALGG 

AAFLGTAHLH 


5415 


693 


2986 


IPPKTKLELQKH \LTTLT \NQEQATI FEEVQKLRPRNEQRENBL 
IISPLRCLFEEKQKEHIHIGEMKQTSQMAAENIGSEIjPPSATRF 
RLDMLKNKAKRSLTESLES ILSRGN2CARGLQRHS I SVDLDSSLS 
STLSNTS KEPSVCEKEAJLP ISESSPKLLGSSEDLS SDSESHLPE 
E PAPLS PQQAFRRRANTLSHFPIECQEPPQPARGS PGVSQRKLM 
RYHSVSTETPHEKKDFES KANHLGDSGGTFVKTRRHSWRQQI FL 
RVATPQKACDSSSRYEDYSELGELPPRSPLEPVCEDGPFGPPPE 
F KKRTgPiELPFLW OFAILOO T t -jI- t iRME y ^0^7 A .Q RNTrt .T^NKg 
LKLD Y EE I TPCLKE VTTVWEKMLSTPGRS KI KFDMEKMHSAVGQ 
GVP\RHHRGEIWKFLAEQFHLKHQPPSKQQPKDVPYKELLKQLT 

SQQHAILIDLGRTFPTHPY FS AQIiGAGQLSLYN I LKAYSLLDQE 
VGYCQGLS F VAG I LLLHM S EE EAFKMLKFLMFDMGLRKQYR P DM 
1 1 LQI QMYQLSRLLHD YHRDLYNHLK EHEIG PSLYAAPWFLTMF 
ASQFPLGFVARVFDMI FLQGTEVIFKVALSLtiGSHKPLILQHEN 
LETIVDF I KS TL PNLG LVQME KT INQVFEMDIAKQLQA YE VEYH 
VLQEEL I DS S PLS DNQRMDKI^KTNS SLRKQNLDLLEQIjQVANG 
RIQSLEATIEKLLSSESKLKQAMLTLELERSAl^QTVFIELRRRS 
AKPSDREPECTQPEPTGD 


5416 


27 


4074 


KSQLFCFWGGKAGDILSGDQDKEQKDPYFVETPYGYQLDLDFLK

YVDDIQKGNTIKRLNIQKRRKPSVPCPEPRTTSGQQGIWTSTES 
LSSSNSDDNKQCPNFLIARSQVTSTPISKPPPPLETSLPFLTIP 
ENRQLPP P S PQLP KHNLHVTKTIJ»4ETRRRLEQERATMQMTPGE F 

RRPRIASFGGMGTTSSLPSFVGSGNHNPAKHQLQNGYQGNGDYG
SYAPAAPTTSSMGSSIRHSPLSSGISTPVTNVSPMHLQHIREQM
AIALKRLKELEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRAA
SQINVCGVRKRSYSAGNASQLEQLSRARRSGGELYIDYEEEEME
TVEQSTQRIKEFRQL\TADMQALEQKIQDSSCEASSELRENGEC

RSVAVGAEENMND IWYHRGS RS CKDAAVGTLVEMRNCGVS VTE 
AMLGVMTEADKEIELMG^IESLKEKrYRLEVQIJlETTH^ 
KLKQELQAAGS RKKVDKATMAQPLWSKVVEAWQTRDQMVGSH 
MDLVOTCVGTSVETNSVGISCQPECK^^CVVGPELP^1NWWIVKBR 
VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTEESVNDLTLLKT 
NLNLKEVRS IGCGDCSVDVTVCS PKECASRGVNTEAVSQVEAAV 
MAVPRTADQDTSTDLEQVHQFTNTETATL I ESCTNTCLS TLDKQ 
TSTQTVETRTVAVGEGRVKDINSSTKTRS IGVGTLLSGHSGFDR 
PSAV^CTKESGVGQINIlTONyLVGIJCMRTI ACGPPQLTVGLTASR 

RSVGVGDDP VGESLENPQ PQAPLGMMTGLDHY I ER I QKLLAEQQ 
TLLAENYSEIAEAFGEPHSC^GSI^SQLISTLSSINSVMKSAST 

EEUEUIPDFQKTSLGK1TGS YLGYTCKCGGLQSGS PLSSQTSQPB 








QEVGTSEGKPISSLDAFPTQEGTLSPVNLTDDQIAAGLYACTNN 
ESTLKS IMKKKDGNKDSNGAKKNLQFVG INGG YE TTSSDDSS SD 
ESSSSESDDEC3DVIEYPLEBEEEEEDBDTRGMAEGHHAVNIEGL 
KSARVEDEMQVQBCEPEKVB IRERYELS EKMLSACNLLKNTIND 
PKALTSKDMR FCLNTLQHEWFRVS SQKSA I PAMVGDY IAAFEAI 
S PDVLR YV I NLADGNGNTALHYSVS HSNFE I VKLLUDADVCNVD 
KQNKAGYTPIMLJ\ALAAVEA£KDMRIVEELFGCGDVNAKASQAG 
QTALflliAVSHGRIDMVXGLLACGADVN I QDDEGSTALM CAS EHG 
HVEIVKLLLAQPGCNGHLEDNDGSTALS IALEAGHKDIAVLLYA 
HVNr AKAQS PGTPRLGRKTS PGPTHRGS FD 


5417 


27 



4074 



KSQLFCFWGGXAGDILSGDQDKEQKDPYFVETPYGYQLDLDFLK 
YVDDI QKGNT I KRLN I QKRRKPS VPCP EPRTTSGQQG I WTSTES 
LSSSNSDDNKQCPNFLIARSQVTSTPI S KPPP PLETSLPFLT I P 
ENRQLPPPS PQLPKHNLHVTKTLM2TRRRLEQERATMQMTPGEF 
RRPRLAS FGGMGTTSS LPSFVGSGNHNPAKHQLQNGYQGNGDYG 
S YAPAAPTTSSMGSS I RH3PLSSG I STPVTNVSPMHLQKIREQM 
AIALKRLKELEEQVRTI PVLQVKIS VLQEEKRQLVSQLKNQRAA 
SQINVCGVRKRSYSAGNASQLEQLSRARRSGGELYIDYEEEEME 
TV EQSTQR I KEFRQL\ TADMQALEQKIQDS S CEASS ELRENGE C 
RSVAVGAEENMNDIVVYHRGSRS CKDAAVGTLVEMRNCG VS VTE 
AmjGWrEADKEIELOXXTTIESLKEKZYRLEVQLRETTHDREMT 
KLKQ ELQAAG S RKKVDKATMAQ P L WSKVVEAWQTRDQMVGSH 
MDLVDTCVGTSVETNSVGISCQPECKNKVVGPELPMNWWIVKER 
VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTEESVNDLTLLKT 
NLNLKEVRS IGCGDCSVDVTVCS PKECASRG VNTEAVS QVEAAV 
MAVPRTADQDTS TDLEQVHQFTNTETATLI ESCTNTCLS TLDKQ 
TSTQTVETRTVAVGEGRVKDINS STKTRS IGVGTLLSGHSGFDR 
PSAVKTKESGVGQININDNYLVGLKMRTIACGPPQI.TVGLTASR 
RS VG VGDDP VGES LENPQPQAPLGMMTGLDHY I ERIQKLLAEQQ 
TLLAENYS ELAEAFGE PHSQMGSLNS QL I STLSS I NS VMKS AST 
EELRNPDFQKTSLGKITGS YLG YTCKCGGLQSGS P LS S QTSQPE 
QBVGTSEGKP I SS LDAF PTQEGTLS P VNLTDDQ I AAG L YACTNN 
ESTLKS IMKKKDGNKDSNGAKKNLQFVG INGG YETTSSDDS S SD 
ESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAEGHHAVNIEGL 
KSARVEDEMQVQECEPEKVSIRERYELSBKMLSACmiLKNTIND 
PKALTSKDMRFCLNTLQHEWFRVSSQXSAI PAMVGDY IAAFEAI 
SPDVLRYVINLAIX5NGNTALHYSVSHSNPEIVKLLLI3ADVCNVD 
HQNKAG YTP I MLAAIiAAVEAEKDMR I VEJ2LFGCGDVNAKAS QAG 
QTALMLAVSHGRI DMVKGLLACGADVNI QDDEGSTALM CAS EHG 
HVB I VKLLLAQPGCNGHLEDNDGSTALS IALEAGHKDIAVLLYA 
HVNFAKAQS PGTPRLGRKTS PGPTHRGS FD 


5418


24 


1133 


S VPRAGGDMBTGAAELYDQALLG I LQH VGNVQDFLRVLFG FLYR 
KTD FYRLLRH P S DRMGF P PGAAQ AL VLQ V FKT FDHMARQDDE KR 
RQELEEKIRRKEEEEAKTVSAAAAEKEPVPVPVQEIEIDSTTEL 
DGHQEVEKVQ PPG P VKEMAHGS QEAEAPGAVAGAAE VPR\ E P ? I 
LPRIQEQFQKNPDS YNGAVRENYTWSQDYTDLEVRVPVPKHVVK 
GKQVSVALSSSSIRVAMLEENGERVLM3GKLTHKINTESSLWSL 
EPGKCVLVNLS KVGEYWWNAI LEGEEP IDIDKINKERSMATVDE 
EEQAVLDRLT FDYHQKLQG KPQSHE LKVHEMLKKGWDAEGS PFR 
GQRFDPAMFNIS PGAVQF 


5419 


1395 


259 


GTHPLDPDLVSRTSVQGPLMTMACPGMSDTEESPFLGPRAAEEG 
SESEACEAFGRRKSEEF^RRSDTSGFGRSRKHKVNWKHPERADA 
KDPASLPQC/ljGP/DCVRPAQPSSKYCSDDCGMKLAANRIYEIL 
PQRIQQWQQSPCIAEEHGKKLLERIRREQQSARTRLQEMERRFH 
ELEAI ILRAKQQAVREDEESNEGDSDDTDLQIFCVSCGHPINPR 
VALRHMERCYAKYESQTS FGSMYPTR IEGATRLFCDVYNPQSKT 
YCKRLQVLCPEHSRDPICVPADEVCGCPLVRDVFELTGDFCRLPK 
RQCNRHY C^^KLRRAE VDLERVRVWY KLDELFEQERNVRTAMTN 
RAGIJ^ALMLHQTIQHDPLTTDLRSSADR 


5420 


117 


1733 


NEAGGAC P FKGG ASGRLYLS PRLPRVS VAGCEER PLGW VVfVLGG 
GGFLPARPPPAQRHLGFSHAEQSMEAPDYEVLSVREQLFHERIR 








EC 1 1 STLLFATLY ILCHXFLTRFKKPAEFTT \GMMKMPPSTRL/ 
LLELCTFTLAIALGAVLLLPFSIISNEVLI^LPRNYYIQWLNGS 
LI KG L WNL VFLFS NLS L I FLMP FAYFFTESEGFAGSRKGVLGRV 
YETVVMIjMLLTIJL.VIjGMVWVASA 3 VDKNKANRESkYDFWEYYIjP 
YLYSCISFLGVLLLLVCTPLGIJtfWFSV^^ 

QL YCS AFEEAALTRR I CN PTS CWL PLDMELLHRQVLiALQTQRVL 
LEKRRKASAWQRNLG YPLAMLCLLVLTGLSVLIVA1HI LELLID 
EAAMPRGMQGTSLGQVSFSKl^SFGAVTOVVLIFYLWVSSWGF 
YS S PL FRS LR PRWHDTAMTQI I GN CVCLLVLS S ALP VPS RTLGL 
TRFDLLGDFGRFNWLGNFYIVFLYNAAFAGLTTLCLVKTFTAAV 
RAEL I RAFGE RE 


5421 



117 



1733 


NEAGGACPFKGGASGRLYl^PRLPRVSVAGCEERPLGWWVLGG 

GGFLPARPPRAQRHLGFSHAEQSMEAPDYEVLSVREQLFHERIR 

ECI 3 STLLFATLYILCHI FLTRFKKPAEFTT\GMMKMPPSTRL/ 

I^ELCTFTIAIAIXSAxnjLLPFSIlSNEVLLSLPRKYYIQWLNGS 

LIHGL WNLVFLFSNLS L I FLM PFAYFFTESEGFAGS RKGVLGR V 

YETVVMLMIXTLLVIJG^mroASAIVDKNKANR^ 

YLYSCI SFLGVl^LLLVCTPLGLARMFSVTGKLLVXPRLLEDLEE 

QLYC^FEEAALTRRXC^PTSCl^PLDMELLHRQyLAJuQTQRVL 

]^KRRKASAWQRm^PLAWLCLLVLTGLSVLIVAIHILELLID 

EAAMPRGMQGTSLGQVSFSKLGSFGAVIQVVLIFYLMVSSVVGF 

YSSPLFRSLRPRWHDTAMTQI I GNCVCLLVLSS AL P VF SRTLGL 
TRFDLLGDFGRFNWLGNFYIVFLYNAAFAGLTTLCLVKTFTAAV 

RAELIRAFGERE 


5422 


3 


1263 


S CGES LPTWLAGASRPG I GRKGGAWGGRGGSS PAQVliLS PG P VF 
KAGCNWWHI*SRDQAG VQRCDLGSSQ F PP LGFKRFS CLSLP S S WD 
YRSTVLCVSKMEADLSGFNIDAP'RWDORTFIiGRVKHFLNITDPR 
TVFVSERELDWAKVMVEKS RMGWP PGTQVEQI.T .YAKKLYDSAF 
HPDTGEKMNVI GRMSFQLPGGMI ITGFMLQFYRTMPAVIFWQWV 
NQS FN AL VNYTNRNAAS PT S VRQMALS Y FTATTTAVAT AVGMNM 
LTKKAP P L VGRWVP FAAVAAAN CVNI PMMRQQEL I KG I CVKDRN 
ENElGRSRRAAAIGITQWISRITMSAPGMIIoLPVIMERLEKLH 
FMQKVKVL/SAPLQ VMIiSGCFL I FM VPVACGLFPQ KCELPVS YI* 
EPKLQDTI KAKYGELEPYVYFNKGL 


5423 


3186 



905 


GVSMALGEEKAEAEASEDTKAQSYGRGSCRERELDI PGPMSGEQ 
P PRLKAEGGLIS PVWGAEG I PAPTCW IGTDPGGPSRAHQPQASD 
ANRE PVAERSE PALSGLPPATMGSGDLIJjSGESQVEKTKIjSS SE 
EFPQTLSLPRTTI CSGHDABTEDDPSI*AI>LPQALDI*SQQPHSSG 
I^GLSQWKSVLSPGSAAQPSSCSISASSTGSSLQGHQERAEPRG 
GSIiAKVSSSLEPVVPQEPSSVVGriGPRPQWSPQPVFSGGnASGIi 
GRRRLSFQAE YWACVLPDSLPPS PDRHS PLMNPNKE YEDLLD YT 
YPLRPGPQLPKKLDSRVFADFVLQDSGVDLDSFSVSPASTLKSP 
TNVS PNCPPAEATALPFSG PRE PSLKQWPSRVPQKQGGMGIiASW 
SQLASTPRAPGSRDARWERRE PALRGAKDRLTIGKHLDMGSPQIi 
RTRDRGWPSPRPEREKRTSQSARRPTCTESRWKSEEEVESDDEY 
LALPARLTQVS S LVS YLGS I STLVTLPTGD I KGQS PLEVSDS DG 
PAS FPS S S SQSQLP PGAALQGSGDPEGQNP CFLRS FVRAHDS AG 
EGSlJGSSQAIiGVSSGLLKTRPSIiPARIiDRWPFSDPDVEGQLPRK 
GGEQG KE S LVQC\ VKTFC\ CQLEEIiICWL YNV\AD VTDHGTPAR 
SNLTSLK\ SSLQLYRQFKI01IDEHQSLTESVLQKGE ILLQCLLE 
NTPVLEDVLGRIAKQSGELESHADRLYDS I LASLDMLAGCTL I P 
DKKPMAAMEHPCEGV 


5424 


3186 


905 


G VSMAIX3EEKAEAEASEDTKAQS YGRGS CRERELD I PG PMSGBQ 
PPRLEAEGGLISPWGAEGIPAPTCWIGTDPGGPSRAHQPQASD 
ANREPVAERSEPALSGLPPATMGSGDLLLSGESOVEKTKLSSSE 
EFPQTLSLPRTTI CSGHDADTEDDPSLADLPQALDLSQQPHSSG 
LSCOSQWKSVLSPGSAAQPSSCSISASSTGSSLQGHQERAEPRG 
GSIJVKVSSSLEPVVPQEPSSWGIX5PRPQWSPQPWSGGDASGL 
GRRRLS FQAEYWACVLPDSLP PS PDRBS P L WNPNKBYE DLLD YT 
YPIJtPGPQLPKHLDSRVPADPVLQDSGVDLDSFSVS PASTLKS P 
TNVSPNCPPAEATALPFSGPREPSLKQWPSRVPQKQGGMG1ASW 








SQIJ^TPRAPGSRDAJlWERREPAIiRGAKDRLTIGKHLDMGSPQL 
RTRDRGWPS PR P ER EKRTSQS AR RPTCTES RWXSEE EVES DDE Y 

LALPARLTQVSSLVSYXiGSISTLVTLPTGDIKGQSPLEVSDSDG 
PASPPSSSSQSQbPPGAALQGSGDPEGQNPCFLRSPVRAHDSAG 
EGSLGSSQAJL^XTSSGIJjKTRPSLPARXjDRWPFSDPDVEGQLPRK 
GGEQGKES LVQC\ VKT FC \ CQLEEL I CWLYNV\ADVTDHGTPAR 
SNTjTSLKXSSZjQLYRQPKXDIDEHQSIjTESVIjQKGEIIiLQCLLE 
NTPVLEDVLGR IAKQSGE^ESHADRLYDS ILASLDMLAGCTLIP 
DKKPMAAMEHPCEGV 


5425 


1086 


115 


GFCPSPSIX3HQPPRVLHPT>ISMAVETFGFF^TVGLIJ4IX3VTLP 
NSYWRVSTVHGNVTTTNTI FENLWFSCATDSLGVYNCWEFPSML 
ALSGYIQACRALMITAI LLGFBGLLLGIAGLRCTNIGGLEL^RK 
AKLAATAGAPH \ I L>PG I CGMVA I \ S W YA FN I TR \ DFS DPL YPGT 
KYELGPALYLGWSASLI S ILGGLCLCSACCCGSDEDPAAS ARRP 
YQAP VS VMP VATS DQ EGDS 3FGKYGRNALR VAAL»CRGPRCL PTA 
PKKRGPGRG P FP YSNLRGRPRP VPVAPPRPRPR VLHSHG PSQAK 
NCSWEVAYLPSEAGSIiI f 


5426 


42 


3435 


ATS SQSLGRADP PRGGTMERSPGEGPSPS PMDQPSAPSDPTDQP 
PAAHAKPDPGSGGQPAGPGAAGEALAVLTS FGRRLLVLI PVYLA 1 
GAVGLSVGFVLFGtiAL»YLGWRRVRD£XEKSI*RAARQLLDDEEQL 
TAKTLYMSHRELPAWVS FPDVE KAEWLNKI VAQVWPFLGQYMEK 
LLAETVAPAVRG S NPHIjQTFTFTR VEIiGEKPLR I IGVKVHPGQR 
KEQ I LLDLNI S YVGDVQ I D VEVKKY FCK^G VKGMQ LHGVLR VTL 
BPLIGDLPFVGAVSMFFIRRF^I^INWTOMTNLI>DIPGLSSLSD 
TMIMDS IAAFLVLPNRLLVPLVPDLQDVAQLRSPLPRGI IRIHL 
liAARGLSSKDKYVKGLIEGKSDPYALVRIXJTQTFCSRVIDEELN 
PQWGETYEVMVHEVPGQEIEVEVFDKDPDKDDFIjGRMKIjDVGKV 
LQAS VLDDWF PLQGGQGQVHLR LEWLS LLSDAEKLEQVLQ WNWG 
VSSRPDPPSAAILVVYLDRAQDLPMVTSEXYPPQLKKGNKEPNP 
MVQLS IQDVTQES KAVYSTNCPVWBEAFRFFLQDPQSQELDVQV 
KDDSRAI*TXGAI*TLPIiARI*LTAPELtIiDQWFQI»SSSGPNSRIjYM 
KLVMRILYI^SSEIO^TVPGCPGAWDVDSENPQRGSSVJDAPPR 
PCHTTPDSQFGTEHVLR IHVbEAQDLIAKDRFIiGGLVKGKSDPY 
VKLKLAGRSFRSRVVRE»LNPRWNEWEVT\rrSVPGQEI£VEVF 
DKDLDKDDFI^RCKVRIiTTVTiNSGFLDEWLTLEDVPSGRLHLRL 
ERLTPRPTAAELEEVLQVNSIjIO^KSAELAAALLSXYMERAED 
LPLRKGTKHLS P YATLTVGDSSHKTKTI SQTSAPVWDESAS FLI 
RKPKTESLFXQVRGEGTGVl^Sr>SLPI>SEIiVAD^ 
SSGQGQVLL&AQU5 1LVSQHSGVEAHSHS YSHS SSSLSEEPELS 
GGPPHI TSSAPEV\ RQRLTHVDS PLEAPAGP LGQVXLTLW Y YSB 
E RKL VS I VHGCRS LRQNGRD P PD P YVS LLLL P DKNRGTKRRTS Q 
KKRTLS PEFNERFEWEIiPIiDEAQRRKLDVS VKSNSS FMSREREL 
LGKVQLDLAETDLSQGVARWYDLMDNKDKGSS 


5427 


42 


3435 


ATSSQS LGRADPPRGGTMERS PGEGPS PS PMDQPSAPSDPTDQP 
PAAHAKPDPGSGGQPAGPGAAGEAIAVLTSFGRRIiLVLIPVYLA 
GAVGIiSVGFVLFGIJUjYI^WRRVRPEKERSIiRAARQIiLDDEEQL. 
TAKTLYMSHREI>PAWVS FPDVE KAEWLNKI VAQVWPFLGQYMEX 
LLAETVAPAVRG SN P HLQT FTFTR VELG EKPLR I IGVKVHPGQR 
KEQIIJaDLNI S YVGDVQID VEVKKYFCKAGVKGMQLHG VLR VI L» 
EPLIGDLPFVGAVSMFFIRRPT1oDI1JWTGMTNI*I»DIPGIjSSLSD 
TMIMDS IAAFLVLPNRLLVPLVPDLQDVAQLRSPLPRGI IRIHL 
LAARGLSSKDKYVKGLIEGKSDPYALVRLGTQTFCSRVIDEELN 

pqwgetyevmvhevpgqbie^7evfdkdpdkddf^grmkldvgkv 

u2asvi^dwfplox3g<k3qvhi*riew^ 

VSSRPDPPSAAILVVYLDRAQDLPMVTSELYPPOLKKGNKEPNP

MVQLSIQDVTQESKAVYSTNCPVWEEAFRFFLQDPQSQELDVQV

KIJDSRALTLGALTLPLARLLTAPEIIILDQWFQLSSSGPNSRLYM

KLVMRILYLDSSEICFPTVPGCPGAWDVDSENPQRGSSVDAPPR

PCHTTPDSQFGTEHVLR IHVLEAQDLIAKDRFXIGGLVKGKSDPY

VKLKLAGRS FRJSHWREDLNPRWNEVFEVI VTSVPGQELBVEVF
DKI3L^KDDFLGRCKVRLTTVIINSGFLDEWLTLEDVPSGRLHLRL








ERLTPRPTAAELJiEVI^VNSLIQTQKSAE 

LPLRKGTKKLS PYATLTVGDSSHKTKTI SQTSAPVWDESASFLI 
RKPHTESLEL^VRGEGTGVLGSLSLPLSELLVADQLCLDRWFTL 
SSGQGQVLiRAQLGILVSQHSGVEAHSHSYSHSSSSLSEEPRTKS 
GGPPHITSS APEV\RQRLTHVDS PLEAPAGPLGQVKLTLWYYSE 
ERKLVSIVHGCRSLRQNGRDPPDPYVSIJ^LLPDKNRGTXRRTSQ 
KKRTLSP E FNER FEWE L P LDEAQRR KLDVS VKSNSS FMSREREL 
I^KVQI^LAETDLS0X3VARWYDLMDNKDKGSS 


5428 


3 


1839 


SSRSEP^ACAIAPPWLVSSRPAJlPAQI^RPGKMN^IXJAEEriED 
LVHFSVSF^PSRGYGVMEEIIWQGKLCDVTLKIGDHKFSAHRIV
LAAS I PYFHAMFTNDMMECKQDE IVMQGMDPSALEALINFAYNG 
NLAI DQQNVQS LLMGAS FLQLQS I KDACCTFLRERLHPKNCLGV 
RQF AETMMCAVijYDAANSi* IHQHFVKVSMSEEFIjA1jPI^DVI^L» 
VSRDEI^KSEEQVFEAAIAWVRYDREQRGTFL\RNIX)SNIP^ 
FCRPQFJ^DRVQQDDLVRCCHKCRDLVDEAXDYLrWPERRPHIiP 
AFRTR PRCCTS I AGLI Y AVGGLN S AGDSLNWEVFDP IANCWER 
CRPMTTARSRVGVAWNGIiIiYAIGGYDGQLRI»STVQAYNTETDT 
WTRVGSMNSKR5AMGTVVIjDGQI YV CGGYDGNSSj^>SSVETYS pe 
TDKWTWT SMS SNRS AA\ G VTVFEGR I YVSGGHDGLQ I FS S VEH 
YNHHTATWHPAAGP1LNKRCRHGAAS LGSKMFVCGGYDGSGFLS 1 
AEMYSSV\ADQWCLIVPM\HTRR \srvslggpavgrlyavwg vt 
TGQSNL\SS VGDVIjTPETDOJTFM \ APMACHEGGVGVG C I PLI/T 
I 


5429 


828 


202 


RREDALSSEGCIjWPSESTVSGNGIPEPQVYAPPRPTDRLAVPPF 
AQRERFHRFQPTYPYLQHEIDLPPTISLiSDGEEPPPYQGPCTIjQ 
LRDPEQQLEL.NRESVRAPPNRTIFDSDU4DSAR1iGGPCPPSSNS 
G1SATCYGSGGRMEGPPP \TYSE V IGHY PGSSFQHQQSSGPPSL
LEGTRLHHTHIAPLESAAIWSKEKDKQKGHPL 


5430 


441 


1507


QKRRKRRRKKI MKTIQPKMHNS ISWAI FTGLAALC^FQGVPVRS 
GDAT FPKAMDNVTVRQGE SATLRCT I DNRVTRVAWLNRST I LYA 
GNDXWCLDPR VVLL SOTQTQYS I EIQNVDVYDEG P YTCSVQTDN 
HPKTSRVHLIVQVSPKIVEISSDISINEGNNISLTCIATGRPEP 
TVTWRHI SPKAVGFVSEDBYLEI QGITREQSGDYECS ASNDV\A 
APV\VRRVKVTVNYPPYISEAKGTGVPVGQKGTLQCEASAVPSA 
EFQWYKDDKRL I /EGKKGVKVENRPFLS KLI FFNVSEHDYGN YT 
CVASNKLGHTNAS IMI^FGPGAVSEVSNGTSPJU^CVWLLPLLVL 


5431


2 


1312 


AAAAPGSRRRR PLPDRPHMAHG YEAP PPPAPRSPAWRARSXP V \ 
LPGITINP \TIAEG P S P \TS EGAS EANLVDLQKKLEELELDEQQ 
KKRLEAFLTQKAKVGE^iKDDDFERISBLGAGMGGVVTKVQHRPS 
GLIMARKLIHLEIKPAXRNQI IRELQVJLHECNSPYIVGFYGAFY 
SDGEIS I CMEHKDGGSLDQVLKE AKR I PEE X LGKVS IAVLRGLA 
YLREKHQ XMHRDVKPSWILVNSRGEIKLCDFGVSGQL I DSMANS 
FVGTRS YMAPERIjQ^THYSVQSD Z WSKGLS LVELAVGR YP IP PP 
DAKELEAI FGRPWDGE EGE PHS I S PRPRP PGRPVSGHGMDSRP 
AMAIFELLDYIVNEPPPKLPNGVFTPDFQEFVNKCLIKNPAERA 
DLKMLTNHTF I KRSE VE EVDFAG WLCKTLRLNOPGTPTRTA V 


5432 


2 


1312


AAAAPGSRRRRPIiPDRPHMAHGYEAPPPPAPRSPAWRARSKPV\ 
LPGI TINP \ T I AEG PS P\TSBGAS EANLVDLQKKLEELELDEQQ. 
KKRLEAFLTQKAKVGELKDDDFERISEXGAGNGGWTKVQHRPS 
GLIMARKLIHLEIKPAIRNQI IRELQVLHECNSPYIVGFYGAFY 
SDGE ISI CMEHMDGGSLDQVLKEAKRI PEE ILGKVS IAVLRGLA 
YLREKHQ IMHRDVKPSNTLVNSRGEI KIiCDFGVSGQLI DSMANS 
FVGTRSYMAPERLQGTHYS VQSDI WSMGLSLVELAVGRYP I PPP 
DAKELEAI FGRPWDGEEGEPHS I SPRPRP PGRPVSGHGMDSRP 
AMAI FET iT»D Y I VNEP P PKLPNGVFTPDFQEFVNKCL I KN PAERA 
DLlMLTNOTFIXRSEVEEVDFAGWLCKTIiRLNQPGTPTRTAV 


5433 


360 


1885 


SVQEDKVGFEDPIiHLCSWRARACPCTWPHC/CTGLLECLGFAGV 
LFGWPSLVFVFKNEDYFKDLCGPDAG P IGNATGQADCKAQDERF 
SLIFTLGSFM23NFMTFPTG YI FDRFKTTVARLIAIFFYTTATLI 








IAFTSAGSAVXjLFLAMPMLTIGGILFIjITIILQIGNIjFGQHRSTI 
ITI*YNGAFDSSSAVF1.I I KLL YE KG I SLR / VbLKLKLCLQ YIAC 
STHFPPDAPGAHPIPTAPQZX3LWPVPWEWHHKGREX3 /QQLSMXT 
GS YSQRSS FQRRKR PQGOGRS RNSAPSGATL/CSRRF A.WWT ,VWT. 

SVIQLWHYLF1GTI»NSI*LTNMAGG]»I^VSTYTNAFAFTQFGVI* 
CAPWNGLLMDRLKQKYQKEARKTGSST1^VALCSWPSIJU>TSL 
LCLGFALCASVPILPLQYLTF I LQVTSRSFLYGSNAAFLTIaAFP 
SEHFGKLFGLVMALSAWSIiLQFP I FTLIKGSLQNDPFYVNVMF 
MliAlLLTFFHPFLVYRBCRTWKESPSAXA 


5434 


66 


652 


R YAALI I SLI QHKLLWRNQHCS RCVT MS PAQSAGIiNWIj F / G SGK 
HGPFLGCSQYPACTYVRPLKSSADGHIVKVLEGQVCPACGANLV 
LRQGRFGMFI G CINYPECEHTBTjI DKPDETA I TCPQCRTGHItVQ 

RRSRYGKTFHSCDRYPECQFA1NFKPIAGBCPECHYPLL1EKKT 
AQGVXHFCASKQCGKPVSAE 


5435 



4704


1597


1 PGDSSQRI»AEMSNAKERKHAKKMRWQPTOTTLSSGFVADRGVKH 
J HSGGEKPFQAQKQEPHPGTSRQRQTRVNPHSLPDPEVNEQSSSK 
j GMFRKKGG WKAGPEGTSQE I PKY I TASTFAQ ARAAE I S AM L.KAV 
TQKS SNSLV FO/TI*P RHMRRJIAMSHWVKRJjPRRJjQE IAQKEAE KA 
VHQK^EHSKNKCHKARRCHMNRTLEFNRRQKKNIWLETraiWIAK 
R FHMVKKWG YCLGERPTVKS HRACYRAMTNRCLLQDIjS YYCCLE 
LKGKHEEILKALSGMCNIDTGLTFAAVHCLSGKRQGSIjVIjYRVN 
KYPRBMLGPVTFIWKSQRTPGDPSESRQLWIWLHPT1.KQDILEE 
IKAACQCVEPI KSAVCIADPLPTPSQEKSQTELPDEKIGKKRKR 
KDIX3EN7AKPIKKIIGDGTRDPCLPYSWISPTTGI I ISDI/TMEMN 
RFRI>IGPLSHSIIjTEAIKAASVHTVGEDTEETPHRWWIETCKKP 

ds vslhcrqeaifellggi tspaei pagtilgltvgdprinlpq 
kkskalpnpekcxjdnekvrqijllegvpvecths fi wmqdi cksv 

i i J^a.AoUU1}1iNRMRSEIJjV?GS PILIjIQQPGKV 

i TGEI>RI^WGSGWDVLI>PKGWGMAFWI PFI YRGVRVGGDKESAVH 

| SQYKRSPNVPGDFPDCPAGMLFAEEQAKNLLEKYKRRPPAKRPN 

YVlObGTIjAPFCCPWEQLTQDWESRVQAYEEPSVASSPNGKESDb 

RRSEVPCAPMPKKTHQPSDEVGTS 1 EHPREAEEVMDAGCQESAG 

PERI TDQEASENHVAATGSHLCVLRSRKLLKQLSAWCGPSS EDS 

RGGRRAPGRGQCGLTREACLSILGHFPRALVWVSLSLLSKGSPE 

PHTMI CVPAKEDFLQLHEDWH YCX3PQES KHSDP FRS KI DKQKEK 

KKREKRQKP\GRASSIX5PAGEEPVAGQEAI*TIX3IiWSGPL»PRVTIj 
HGS RTI»LiGFVTOG DFSM A Vfi fVJP AT /JPV*? r/TYTT J.DMT c»<"> & AO 

RGLVLIjRPPASLQYRPARIAIEV 


5436


1781


635


ASDS I PWSEARTTRKLAQRGCQWSI>PERMPLVVFCGLP YSGKSR 
RAEEIJi VALAAEGRA VYVVDDAAVTjGAED PAVYGDS AREKALRG 
ALRASVERPiSRHDWIIJDSLNYIKGFRYELYXCLARAARTPLC 
LVYOHiPGGPIAGPQVAGANENPGRNVSVSWRPRAEEDGRAQAA 
GSSVLRELHTADSWKGSAQADVPKELERE5SGAAES PALVTPD 

SEKSAKHGSGAFYS PELLEALTLRFEAJPDSRNRWDRP LFTLVGI* 
EEPLPLAGIRSALFENRAPppHOSTOSOPLASGSFTjHOLrXDVTS 
Q VliAGIJ^EAQKS AVPGI)LLTL PGTTKFT iR FTR PLT^fAEI. S Rl> RR 
Q FI S YTTKMHPNNENIi PQLANM FLQ YLSQSLH 


5437 


739 


1672


CQEAASEFGGPLHTPAMFLRRLGGWLPRPWGRRKPMRPDPPYPE 
PRRVDSSSENSGSDWDSAPETMEDVGHPKTKDSGALRVSRAASE 
PSKEEP<JVEQLGSKRMDSI,KWDQPISSTQESGRIiEAGGASPKLR 
WDHVDSGGTRRPGVSPEGGL\GVPGPGAPLEKPGRREKbL»GWIiR 
GEPGAPSRYIiGGPEECLQISTNLTLHI^ELliASAIJJ^CSRPLR 
AALDTIGLRGPI/SLWI^GLI^FX^ 

CLFGIaLQALVLAVS LREPNGD EAATDWB S egle REGEEQRG d pg 
KGL 


5438 


2443


1152


TKPRKRRHQPASQRQRPWSSDSTGDLLARGKGRKEENKGSDRVS 

IAPPSLRRPMMC<)SEARCGPELRAAKWLHFPQL^ 

MSRPALKLRSWPLTVLYYLLPFGALRPLSRVGWRPVSRVALYKS
VPTRLLSRAWGRLNQVELPHWLRRPVYSLYIWTFGVNMKEAAVE
DLHHYRNLSEFFRRKLKPQARPVCGLHSVISPSDGRILNFGQVK
NCEVEQVKGVTYSLESFLGPRMCTEDLPFPPAASCDSFKNQLVT
REGNELYHCVIYLAPGDYHCFHSPTDWTVSHRRHFPGSLMSVNP
GMARWIKELFCHNERWLTGDWKHGFFSLTAVGAT\NWGSIRIY
FDRDLHTNSPRHSKGSYNDFSFVTHTNREGVPMALRGEHLG/QS
FNLGSTIVLIFEAPKDFNFQLKTGQKIRFGEALGSL


5439 


2443 


1152 


TKPRKRRHQPASQRQRPWSSDSTGDLLARGKGRKEENKGSDRVS

IAPPSUIRPMMCQSEAROGPELRAAJCNI^PQIaALRRRL^ 

MSRPALKLRSWPLTVLYYLLPFGALRPLSRVGWRPVSRVALYKS
VPTRLLSRAWGRLNQVELPHWLRRPVYSLYIWTFGVNMKEAAVE
DLHHYRNLSEFFRRKLKPQARPVCGLHSVISPSDGRILNFGQVK
NCEVEQVKGVTYSLESFLGPRMCTEDLPFPPAASCDSFKNQLVT
REGNELYHCVIYLAPGDYHCFHSPTDWTVSHRRHFPGSLMSVNP
GMARWIKELFCHNERWLTGDWKHGFFSLTAVGAT\NWGSIRIY
FDRDLHTNSPRHSKGSYNDFSFVTHTNREGVPMALRGEHLG/QS
FNLGSTIVLIFEAPKDFNFQLKTGQKIRFGEALGSL


5440 


693 


253 


EP I P VTPDHRIiVTMTH IV \QTFSPVNS \GQPPNYEMUCEEQEVA 
MI^APHNPAPPMSWIHIRSETSVPDHVWSLFHTLFMNTCCLG 
FIAFAY S VKS RDRKMVGDVTGAQAYASTAKCLN I W AL. I LG I FMT 

ILLI 1 1 PVIiWQAQR 


5441 


2 


2054 


"CRDGGKNGFMVSPMKPLE I KTQCSGPRMDPKICPADPAFFS FIN 
NSDLWVANIETGEERRLTFCHQGIiSNVLDDPKSAGVATFVIQEE 
FDRFTGYWWCPTASWEGSEGLKTLRILYEEVDESEVEVIHVPSP 
ALEERKTDSYRYPRTGSKNPKIAIiKIAEFQTDSQGKIVSTOEKE 
L VQ P FS S LFP KYE Y I ARAGWT RDG KYAWAM FliDR PQQ WLQL VLLi 
PPALPIPSTENEEOXRIiASARAVPRKVQPYVVYEEVTNVWINVH 
DIFYPFPQSEGFJ)ELCFIiRANECKTGFC«LYKVTAV3 J KSQGYDW 
SEPFS PGEGEQSL7NAIWVNEETKLVYFQGTKDTPLEHHLYWS 
YEAAGE I VRLTTPG FSHSCSMSQNFDMFVS HYS S VS TPPCVKVY 
KLSGPDDDPLHKQPRFWASMMEAAKI FHFHTRSDVRL.YGMIYKP 
HALQPGKKHPTVLFVYGGPQVQLVNNSFKG I KYLRLNTIASU3Y 
AWVIDGRGSCQRGLRFEGALKNQMGQVEI EDQVEGLQFVAEKY 
GFIDI^RVAIHGWSYGGFI^LMGIjIHKPQVFKVAIAGAPVTVWM 
AYDTG YTER YMD VPENNOTGYEAG S VAliHVE KLPNE PNRLLII»H 
G fldenvh f FHTNFI>VSQL» I RAGKP y qlqvalp p vs PQ I YPNER • 
HSIRCPESGEHYEVTLXHFLQEYL 


5442 


1 


3474 


"CGQRSRRRSPDMPEAKPAAKKAPKGKDAPKGAPKEAPPKEAPAE 
APKEAPPEDQS PTAEEPTGVFLKKPDSV3 V ETGKDAVWAKVNG 
KELPDKPTIKWPKGKWIiEIiGSKSGARFSFKESHNSASNVYTVEIi 
HIGKVVLGDRGxYRLEVKAKDTCDS CGFNI DVEAPRQDASGQSIi 
ESFKRTSEKKSDTAGEIiDFSGLLKKREVVEEEiaaaCKKDDDDLG 
IPPEIKELLKGAKKSEYEIOAFQYGITDLRGMUKRLKKAKVEVK 
KSAAFa'iUCLDPAYQVDRGNKlKLMVSISDPDLTLKWPKNGQEIK 
PSSKYVFENVGKKRIL.TINKCTLADDAAYEVAVKDEKCFTELFV 
KEPPVLIVTPLE1DQQVFVGDRVEMAVEVSEEGAQVMWMK1WVEL 
TREDSFKAR YRFKKDGKRHILI FSD WQEDRGRYQ VITNGGQCE 
AELIVEEKQLEVI^DIADIjTWASEQAVFKCEVSDEKVTGKWYK 
NGVEVRPS KRITISHVGRFHKLVIDDVRPEDEGDYTFVPDGYAL 

gslsaklnfleikveyvpkq\eppkiplgfasggktsenad/iv 
wagn^rldv\sitgeapspfat\wlkg\devftttegrtrie 
krvdcssfviesaqredegrytikvtnpigbdvasiflqvvdvp 

DPPEAVRITSVGFXWAlLVVTCPPMYDGGKPVTGYliVKRKKKGSQ 
P^KT,TsTFEVFTETTYESTK>lIEGILYEMRVFAVNAIGVSQPSMN 
TKPFMPlAPTSEPLKLIVEPViDriTTLKWRPPNRIGAGGIDGY 
LVEYCLEGSEEWVPANTEPVERCX5FTVKNLPTGARILFT?WGVN 
IAGRSEPATLAQPVTI RE IAEPPKI R L P RHLRQTYI RKVGEQLN 
LWPFQGKPR PQVVWTKGGAPLDTSRVHVRTSD FDTVFFVROAA 
RSDSGE YEI*S VQ IENMKDTATIR I RWE KAGP P INVMVKEVWGT 
NALVE^WQAPKDDGNSEIMGYFVQKADBCKTMEWF^TVYERNRHTSC 
TVSDLIVGNEYYFRVYTENICGLSDSPGVSKNTARILKTGITFK 
PPEYKEHDFRMAPKFl*TPI»IDRWVAGYSAAIiNCAVRGHPKPKV 
VWMKNKME J RED P KFL I TtfYQG VLTLN I R RPS P FDAG TYTCRAV 
NELGEALAECKLEVRVPQ 


5443 


66 


1003 


SRGQIiDAGQS S EQHGGNRQPEQS RS RSS SSSSS PRRSRSAAE PA 
MAI*SMPLNGI*KEEDKEPLI ELFVKAGSDGES IGNCPFS QRLFM I 
LWLKGWFSVTTVDLKRKPADLQNIAPGTHPPFI TFNSEVKTDV 
NKI EE FLEEVLCPP KYIJCLS PKHPE SNTAGMDI FAXFS AYIKNS 
RPEAKEAIiERGI»IiKTLQ KIiDBYLNS PLFDEIDENSMED I KFSTR 
KFLDGNEKTToADCNIiPKI^VKVVA 

LTNAYSRDEFTNTCPSDKEVEI \AYSDVAKRLHQVKSRLLKEVS 
FMSSP 


5444 


2 


344 


SGPIGVl!GAQMAKWIiRDYI*SFGGRRPPPQPPTPDYTESDI£*RAY 
RAQ KNLDFEDP Y* DSESRLEPDPAG PGDS KNPGDAKYG S P KHRL 
I KVEAADMARAKAX.LGG PGEELEADTEYLDP FDAQPHPAP PDDG 
YME P YDAQWVMSELPGRGVQLYDTP YEEQDPETADGPPSGQKPR 
QSRMPQEDBRPADEYDQPWEWKKDHISRAFAVQFDSPEWERTPG 
SAXELRRPPPRSPQPAERVDPALPLEKQPVTFHGPLNRADAESMj 
SLCKEGSYLVRI^ETNPQDCSLSIiRSSQGFLHLKFARTRENQVV 
LGQHS GPFPS VPELVLHYSSR PI* P VQGAEHLALLYP WTQTP 1 * Q 
* PDWGDRRPNGQVATGLPELWGAEAPSAAAHPGLHRERHPEGLP 
RAE KPGLRG PLLGL.RE PLGAGPRG P WGJjQEPRRCQWFSQAPAH 
QGGGCGYGQSQGPSGRPRGGAGSRH 


5445 


2364 


486 


ILSRGFLGS VEI CIQLPLPASEP VI*I*IiTWARRRWRETRSRREPT 
TLRAQSVCPWW I * ETRMNRS I PVEVDESEPYPSQLLKP I PEYSP 
EBESEPPAPN1RNMAPNSIjSAPTM1»HNSSGDFSQAHST1*KLANH 
QRPVSRQVTCIjRTQVIiE DS EDS FCRRHPGI»G KAFPSGC S AVSE P 
AS ES WGALPAEHGFS FMEKRNOWI>VSOLS AAS PDTGHD SDKS D 
QSLPNASADSI^GSQEMVQRPQPHRNRAGLDLPTIDTGYDSQPQ 
DVLGIRQLERPbPLTSVCYPQDLPRPLRSREFPQFEPQRYPACA 
QMLPPNIiSPHAPWNYHYHCPGSPDHQVPYGHDYPRAAYQQVIQP 
Ai»PGQPIiPGAS VRGliHP VQKVI I*NYPS P WDQEERPAQRDCS FPG 
LPRHQDQPHHQPPNRAGAPGESIjBCPAEIiRPQVPQPPS PAAVPR 
PPSNPPARGTLKTSNLPEELRKVFITYSMDTAME\nnCFVNFLLV 
NGFQTAIDI FEDRIRG IDI I KWMER Y1»RDKTVMI I VAIS PKYKQ 
DVEGAESQIJDEDEHGLH1TCYIHRMMQIEFIKQGSMNFRFIPVLF 
PNAKKEHVPTWLQNTHVYS WPKNKKN I LLRLLREEEYVAPPRGP 
LPTLQWPL 


5446 


972 


161 


SS WS WCTGRMR KTRIiWGLl* WMIiFVSEIiRAATKIJTE EKYEDKEGQ 
TLDV KCD YTLEKFASSQKAWQI IRDGEMPKTIiACTERPSKNSHP 
VQVGRI I LED YHDHGLLRVRMVNLQVEDSGLYQCVTYQP PKEPH 
MLFDRIRLWTKG FSGTPGSNENSTQNVYK1 PPV1TKALCPLYT 
TPRTVTQAPPKSTADVSTPDSEHTLTNVTDI IRVPVFNl VI UjA 
GGFLSKSLVFSVLFAVTLRS FVP * AHEPTRMSSDFQPHPSGSC3V 
KGGGRR 


5447 


207 


617 


MTARTLS LMASLVAYDDS DS EAETEHAGSFNATGQQKDTSGVAR 
PPGODFASGTLDVPKAGAQPTKHGSCEDPGGYRiPLAQLGRSDR 
GSCPSQRLQWPGKEPQVTFPIKEPSCSSIiWTSHVPASHMPLAAA 
RFKQVKLSROTPKSSFHAQSESEWGia^GSSEXJKKKCEDC^rVPY 
TPRRLRQ RQALS TE^TGKGKD VEPQGP PAGRAPAPL YVG PGVS EF 
IQPYLNSHYKETTVPRKVLFm*RGHRGPVNTIQWCPVDSKSJ©lL 
LSTSMDKTFK^/VmAVDSGHCI^TYSLHTEAVPJ^^WAPCGRRIIi 
SGGFDFAIiHLTDLETGTXJLPSGRSDFRITTLKFHPXDHN I FLCG 
GFSSEMKAWDIRTGKVMRSYKATIQQTIjDILFLREGSEFI,SSTD 
ASTRDSADRT1 IAWDFRTSAKISNQIFHERFTCPSI*ALHPREPV 
FljAQTKG^IiALFSTVWPYRWSRllRRYEGHKVEGYSVGCECS PG 
GDLLVTG S ADGRVLMYSFRTASRACTLOC5OTQ AC VGTTYH PVLP 
S VLAT CS WGGDMKI WH * AFHVTLS LG EA I G DLAP ARG YSG PGRS L 
KSPSPSKSLLVLLCGRAMFQPATCPWQIiPAIiSK 


5448 


194 


1833 


MASKVTDAI VWYQKKIGAYDQQI WEKS VEQREI KGLRNKPKXTA 
HVKPDLIDVDLVRGSAFAKAKPESPWTSLTTKG I VRWFFPF FF 
RWWI^VTSKVTFFWIAVLYI^VAAIVLFCSTSSPHSIPLTEVI 
GPI^^IJjGTVHCQIVSTRTPKPPI^TGGKRRRKIaRKAAHLEV 
HREGDGSSTTDNTQEGAVQiraGTSTSHSVGTVFRDLWHAAFFLS 
GSKKAKNS XDKS 7ETDNG YVSLDGKKTVKSGEDG IQNHEPQCET 
"IRPEETAWNTCTLRNGPSKDTQRTITNVSDEVSSEEGPETGYSL 
RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR 
QDSESARPESETEDVLWEDL.LHCAECHSSCTSETDVENHQINPC 
VKKEYRDDPFHQSHLP«IiHSSEIPGLEKISAIVVrEGNI)C3CKADMS 
VLB I S GMI MNR VNSH I PG I G YQI FGNAVSL I LG Ii TP F VFRLS QR 
TDLEQLTAHSASELYVIAFGSNEDVIVLSMV1 ISFWRVSLVWI 
FPFLLCVAERTYKQVGIM * TS EX»V1J?NRKSHHYKKHY PNEDAPK 
SGTSCSSRCSSSRQDSESARP ESETEDVX.WEDLLHCAE CHS SCT 
SETDVENHQINPCVKKEYRDDPFHQSHLPWIiHSSHPGLEKISAI 
VWEGNDCK3(ADMSV1J3ISGMIMNRVNSHIPGIGYQIFGNAVSLI 
LGLTPFVFRI>SQATDLEQIiTAHSASELYVIAFGSNEDVIVLSMV 
I IS FWR VSIiVW I FFFLLCVAERTYKQVGIM 


5449


194 



1833 


MASKVTDAIWYQKKIGAYDQQIWEKSVEQREI KGLRNKPKKTA 
HVKP DL I D VDLVRG SAFAKAXP ES P WTS LTTKG I VR WF F P F FF 
RWWLQVTS KVI FFWLLVL YLLQVAAI VLFCSTSSPHS IPLTEVT 
GPIWLMT,TJ^TYHCQIVSTRTPKPPI»STGGKRRRKL,RKAAHLEV 

HREGDG S S TTDNTQ EGAVQNHGTST SHS VG TVFRDLWHAAF F LS 
GSKKAKNS IDKSTETDNGYVSLDGKKTVKSGEDG I QNHEPQC3T 
IRPEETAWNTGTLRNGPS KDTQRTITNVSDEVS S EEGPETG YS L 
t> Din mn t c »nrr dud if CTTH vv VWY PNEH APKSG TS CSSRCS S S R 
QDSESARPESETEDVLWEDI.LHCABCHSSCTSETDVENHQINPC 
VKKE YRDDPFHQSHLPWLKSS HPGLE KI S AI VWEGNDCKKADMS 
VLEISGMIMNRVNSHIPGIGYQIFGNAVSLILGLTPFVFRLSQA 

TDLEQLTAHSAS EliYVIAFGSNEDV I VLS MVI I S FWRVS LVWI 
FFFLLCVAERTYKQVGIM* TSEGVLRNRKSHHYKKHYPNEDAPK 
SGTSCS S RCSSSRQDS ESARPESETED VLWEDLLHCAECHS S CT 
SETDVENHQINPCVKKEYRDDPFHQSHLPWLHSSHPGLEKI SAI 
VWEGNDCKKADMSVLE I SGMIMNRVNS H I PGIGYQI FGNAVSLI 
LGLTPFVFRLSQATDLEQLTAHSASELYVTAFGSNSDVIVLSMV 
1 1 SFVVRVS LVWI FFFLLCVAERTYKQVGIM 


5450 


8136


1242 


~GQQFASFFG*NHPE\nrVAMALTDIDLQLQFSMSQPEAIJ J LIiAAG 
PADHLLLQLYSGHLQVRLVLGQEELRLQTPAETLLSDS I PHTW 
LTWEG WATLSVDG FLNAS SAVPGAPLE VP YGLFVGGTGTLGL P 
YLRGTSRPLRGCLHAATLNGRSLLRPLTPDVHEGCAEEFSASDD 
VAIiGFSGPHSIiAAFPAWGTQDEGTIiEFTLTTQSRQAPLAFQAGG 
RRGDFIYVT3IFEGHLRAVVEKGQGTVLLHNSVPVADGQPHEVSV 

HINAHRLE I S VDQYPTHTSNRG VLSYLE PRGSIiLLGGLDAEASR 
H1X2EHR1^LTPEATNASIXGCMEDLSWGQRRGLRFALLTRNMA 

AGCRLEEEEYEDDAYGKYEAFS TLAPBAWPAMELPE PCVPE PGL 
PPVFANFTQLLT IS PLWAEGGTAWLEWRHVQPTLDLMKAELRK 
SQVLFS VTRGAJTyGELEIJ>I UIAQARKMFTLLDVVNRKARF I HD 
GSFXrrSDQLVLEVSVTARVPMPSCLRRGQTYLLP IQVNPVNDP ? 
HI IFPHGSLMVILEHTQKPLGPEVFGAYDPDSACEGLTFQVLGT 
SSGLPVERRDQPGEPATEFSCRELEAGSLVYVHCGG PAQDLTFR 

VSDGLQAS P PATLKWAI RPAIQ I HRS TGLRLAQGSAMP ILPAN 
IjSVETNAVGODVSVLFRVTGAIjOFGEXQKHSTX^VEGAEWWATQ 
AFHQRDVEGGRVRYLSTDPQHHAYDTVENLALEVQVGQE ILSNL 
S FPVTI QRATVWMIJ^EPLHTQNTCX2ETLTTAHLEATLEEAG PS 
PPTFHYEWQAPRKGNLQLQGTRLSDGQGFTQDD IQAGRVTYGA 
TARAS EAVEDTFRFR VTAPP Y FS PLYTFP I H I GGD PDAP VLTNV 
LLVVPEGG EG VLSADHLFVKSI^S AS YLYEVMERPRLGRLAWRG 
TQDKTTMVTS FTNEDLLRGRLVYQHDDSETTEDD I PFVATRQGE 
S S GDMAWEEVRGVFRVAI QP VNDHAPVQT I SRIFHVARGGRRLL 
TTDDVAFSDADSGFADAQLVLTRKDLLFGS I VAVDEPTRPI YRF 
TQEDLR KRR VLFVHS G ADRG W I QLQ VSDGQHQATALLEVQAS E P 
YIiRVANGSSLVVP0GGCX5TIDTAVljHIiDTNLD 1 RSGDEVHYH VT 
AGPRWGQLVRAGQPATAFSQQDLI^AVLYSHNGSLSPEDTMAF 
SVEAGPVHTmTLQVTlAI^PIJ^I^VRHKKIWFC^EAAEI 
RPJ5QLEAAQEAVPPADI VFSV1CSPPSAGYLVMVSRGALADEP PS 

LDPVQSFS QEAVDTGRVL YLHS R PEAWSDAFS LDVASGLGAPLE 
GVLVELEVLPAAIPLEAQNFSVPEGGSLTLAPPLLRVSGPYFPT 
LLGLS LOVLEP PQHG PLQKEDG PQARTLS AFS WRMVEEQLI RYV 
HDGSETLTDS FVLMANASEMDRQSHPVAFTVTVLPVNDQPP I LT 
TNTGLQMWEGATAP I PAEALRSTDGDSGSEDLVYTIEQPSNGRV 
VLRGAPGTEVRS FTQAQLIX^LVLFSHRGTLDGGFPFRLSDGEH 
TSPGHFFRVTAQKQVLLSLKGSQTLTVCPGSVQPLSSQTLRASS 
SAGTD PQLLLYR WRGPQLGRLFHAQQDSTGEALVN FTQAEVYA 
GNILYEHEMPPEPFWEAHDTLELQLSS PPAR0VAATLAVAVSFE | 
AACPQRPSHL WKNRGLWVPEGQRARI TVAALDASNLLAS VPS PQ 
RSEHDVLFQVl^FPSRGQLJjVSEEPLHAGQPHFLQSQLAAGQLV 
Y AHGGGGTQQDG FHFRAHLQGPAGAS VAG PQTS EAFAITVRDVN 
ERPPQPQASVPLRLTRGSRAPISRAQLSWDPDSAPGEIEYEVQ 
RAPHNGFLSLVGGGIiGPVTRFTOADVDSGRLAFVANGSSVAGI F 
QLSMSDGASP PLPMSLAVDI LPSAI EVQLRAPLEVPQALGRS SL 
SOQQLRVVSDREEPEAAYRLIQGPOYGHLLVGGRPTSAFSQFQI 
DQGEWFAFTNFS S SHDHFR\HJ\LARG VNASAVVinTxv^ALLHV 
WAGGPWPQGATLRIJ5PTVLDAGELANRTGSVPRFRLLEGPRHGR 
VVRVPRARTEPGGSQLVEQFTQQDLEDGRLGLEVGRPEGRAPGP 
AGDSLTLELWAQGVP PAVAS LDFATEP YNAARP YS VALLS VP EA 
ARTEAG KP ES S TPTGEPG PKAS S P E PA VAKGG FLS FLEANMFS V 
1 1 P MCL VLLLLAL I LPLLFYL RKRNKTG KHD VQ VLTA K P RNG LA 
GDTBTFRKVEPGQAI PLTAVPGQGPPPGGQPDPELLQFCRTPNP 
ALKNGQYWV 


5451


1 


2274 



RDS S EQGRTGDTLGRPS ACMDALKP P CLWRN HERG KKDRU S CGR 
KNSEPGSPHSLEALRDAAPSQGLNFLLLPTKMLFI FNFLFSPLP 
TPALICILTFGAAI FLWLI TRPQP VLPLLDLNNQS VG IEGGAR K 
GVSQKNITOLTSCCFSDAKTMYEVFQRGLAVSDNGPCLGYRKPNQ 
PYRWLS YKQVS DRAE YLGS CLLHKG YKS SPDQFVG I FAQNRPSW 
IISELACYTYSMVAVPLYDTLGPEAIVH IVNKADIAMVICDTPQ 
KALVLI GNVEKG FTPSLKVX I LMDP FDDDLKQRGE KSGI E ILS L 
YDAENLGKEHFRKPVPPS PEDLSVICFTSGTTGDPXGAMITHQN 
IVSllAAAFI^CVEl^YEPTPDDVAISYLPLAHMFERIVQAVVYS 
CGARVGFFQGD I RLI^DMKTLKPTLFP AVPRLLNRI YDKVQNE 
AKTPLKKFLLKLAVSSKFKELQKGI IRHDSFWDKLIFAKIQDSL 
GGRVRVI VTGAAPM STS VMTFFRAAMGCQVYEAYGQTECTGG CT 
FTLPGDWTSGHVGVPLACNYVKT iKOVADMNYFTVNNEGEVCI KG 
•JWFKGYLKDPEKTQEALDSDGWLHTGDIGRWLPNGTLKIIDRK 
KNIFiCLAQGEYIAPEiaENIYNRSQPVLQIFVHGESLRSSLVGV 
VVPDTDVLPSFAAXLG VKGS FEEL»CQNQ VVRBAI LEDLQKIGKE 
SGLKT FEQVKAI FLHPEPFS IENGLLTPTLKAKRGELSKYFRTQ 
IDSLYEHIQD 


5452 


1333 


113 8 


SRVPSLCLSLSLSLSPSREPVAGAPGCGTAGPPAMATLWGGLLR 
IX5SLLSI*SCLALS VLLLAQLSDAAKNFEDVRCKC I CPP YKENSG 
HI YNKNI SQKDCDCLHWE PMPVRGPDVEAYCLRCECKYEERSS 
VTIKVTI 1 1 YLS IIX3LLLJ#YMVYLTLVEPILKRRLFGHAQLIQS 
DDD IGDHQPFANAHD VLARS RS RANVLNKVE YAQQRWKLQVQEQ 
RKSVFDRHWLS 


5453 


111 


1520 



PS I PAAVPQSAPPE PHREETVTATATS Q VAQQPPAAAAPGEQAV 

AGPAPSTVPSSTSKDRPVSQPSLVGSKEEPPPARSGSGGGSAKE 

PQEERSQQQDDIEELETKAVGMSNDGRFLKFDIE IGRGSFKTVY 

KGU)TETTVEVAWCEIX5DRKLTKSERQRFKEEAEMLKGLQHPKI 

VRFYDSWESTVKGKKCIVLVTEIMTSGTLKTY^ 

RSWCRQlLKGLCFLHTRTPPIIHRDLKCDNIFrTGPTGSVKIGD 

LGIATLKRASFAKSVIGTPEFMAPE^EEKYDESVDVYAFGMCM 

LEMATSE YPYSECQNAAQI YRR VTSGVKPAS FDKVAI PEVKEI I 

EGCIRQNKDERYS I KDLLNHAFFQEETGVRVELAEEDDGEKI AI 

KLWLRIEDIKiOJKGKYKDNEAIEFSFDLERNVPEDVAQEMVESG 

YVCEGDHKTMAKAIKDRVSLI KRKREQRQL* 


5454 


111 


1520 


PSIPAAVPQSAPPEPHREETVTATATSQVAQQPPAAAAPGEOAV 

AG PAPSTVPS STS KDRP VSQPS LVGS KEEP PPARSGSGGGSAKE 
PQEF^SQQQDDIEELBTKAVGMSNDGRFLKFDIEIGRGSFKTVY 
KGLDTETTVKVAWCFaxQDRKLTKSERQRFK^FAEMLKGLQHPNI 
VRFYDSW ES TV KGKKC I VI^VTELOTSGTLKTY LKRFKVMKI KVL 
RSWCRQILKGLQFLHTRTPPI1HRDLKCDNTFITGPTGSVKIGD 
LGLATLKRAS FAXSVTGTPEFMAPEMYEEKYDESVDVYAFGMCM 
LEMATSEYP YS ECQNAAQ I YRRVTSGVKPAS FDKVA1 PEVKEI I 
EGCIRQNXDER YS I KDLLNHAFFQEETGVRVELAEEDIDGEKIAI 
KLl^RIEDI KKLKGKYKDNEAIEFSFDLERNVPEDVAQEMVESG 
YVCEGDHKTMAJCAI KDRVS 1*1 KR KREQRQL * 


5455 


1359 


377 


LTMVSPATRKSLPKVKAMDFITSTAI LPLLFGCLGV FGLFRLLQ 
WVRGKAYLRNAVWI TGATSGI^KECAKVFYAAGAKLVLCGRNG 
GALEELIRELTASHATKVQTHKPYLVTFDLTDSGAIVAAAAEIL 
. QC FGYVD I LVNNAG I S Y RGTI MDTTVD VD KR VME TNY FG P VAL.T 
XALLPSMI XERQGHI VA ISSIQGKMS I PFRSA YAAS KHATQAF F 
DCIiRAEMEQY E I EVTV I S PG Y IHTNLSVNAI TAIXSSR YG VMDTT 
TAQGRS P VEVAQDVIAAVGXKKKD VI LADIiPSI^VYI»RTI*APG 
LFFSLMASRARKERKSKNS 


5456 


2 


2332 


CGAGLVAAG AVXjVLY PAS RAGE RTR V ?3S PAP S S LiPLHS PGACG 
TEVDMDPQRSPI*LEVKGNIEIiKRPl»IXAPSQI/PI»SGSRLKRRPD 
QMEDGLEPEKKRTRGLGATTKITTSHPRVPSLTTVPQTQGQTTA 
QKVSKKTGPRCSTAIATGIjKNQKPVPAVPVQKSGTSGVPPMAGG 
KJ<PSKRPAWDLKGQLCI)LtIAELKRCRERTQTLIX2ENQQLQDQLR 
DA0X2QVKAI^TERTTLEGHIAKVQ^ 

EBRLSTQEGI>VQELQKKQVELQE ERRGLMSQIiEEKERRLQTS EA 

< T n npAntnrn ot DrMTTTrTiAa 21 T .T Tt?T?'P'P , 'PT»W^3T.PMRI?'Pl>T»H>iryf ■ 
ALSSSQAEVASXjKQfc* A v 1 1 1 . i iiitvxjiiftJ-irivj xj^riiiirtrt^vjjriiN^ij 

QEI^GNIRVFCKVRPVLPGEPTPPPGLLliFPSGPGGPSDPPTRlj 
SLSRSDERRGTIjSGAPAPPTRHDFSFDRVFPPGSGQDEVFEEIA 
MLVQSALDGYPVCIFAYGQTGSGKTFTMEGGPGGDPQIiEGLIPR 
ALRHLFSVAQELSGQGVTTYS FVASYVE IYNETVRDLLATGTRKG 
QGGECEI RRAGPGSEEI>TVTNARWPVSCEKEVTjAL.LHLARQ>m 
AVARTAQNERSSR8HSVFQLQISGEHSSRGLQCGAPLSLVDI1AG 
SERLDPGLALGPGEPJ3U.RETQAINSSI*STIiGI*VXMALSNKESH 
VPYRNS KLTYLLQNSLG^SAKMI^FVNI SPLEENV3ESLNS IiRF 
ASKVEPS^FGTAQSNRKVJK^DPDLCVCVCVCVCV^CVCVCVP 
MSMYRVRGGRVAGGCFIGWRAPCPRAtK 


5457 


2 


1540 


" DDFVERRRWTRTTCLVRS P PHVPVCGHACSWNGGSLDP I»KGTPA 
TT.PQaPT?T .MPTCVKTer.RLDKENTGSWRS FSLNSEG AERMATTGTP 
TADRGDAAATDDPAARFQVQKHSWDGLRSI IHGSRXYSGLI VNK 
APHDFQFVQKTDESGPHSHRLYYLGMPYGSRENSLIiYSEIPKKV 
RKEALIJ^WKC^LDHFQATPHHGVYSREEEI^RERKKIjG VFG I 
TSYDFESESGLFLFQASNSIjFHCRDGGKNGFMVSPGPGCVSPMK 
PIiEIKTQCSGPRMPPKXCPADPAFFSFINNSDLWVANIETGEER 
RLTFCHQGLSNVIjDDPKSAGVATFVIQEEFDRFTGYWWCPTASW 
EGSEGLKTLRIIjYEEVDESEVEVIHVPSPALEERKTDSYRYPRT 

gskkpkialkijusfqtdsc^kivstoek^ 
aragwtrdgkyawamfijdrpqqwlqlvllppalfi psteneeqa 
aslcqscpqecpavcgvrgghqrldqcs 


5458 


6642


4022 


fvpglrepqwepaqps atmsapseeeeyarjlvmeaqp ewlraev 

KRLSK^IAETTREKIQAAEYGIiAVIiEEKHQI^QFEEJ^E^YEA 
IRSXMEQLKEAFGQAHTNHKKVAADGE S REES L I QESASKEQ YY 
VRKVLEIX^ELKQL&NVXtTNTQ 

QRGRLRDDIKEYKFREARLI»QDYSELEEEN1SLQKQVSVLRQNQ 
VEFEGLKHE I KRLEEETEYLNS QLEDAI RLKEI S ERQLiEEALET 
LKTEREQKNSLRKELSHYMS INDS FYTSHLHVSLDGLKFSDDAA 
EPNITOAEALWGFEHGGI^AKLPLDNKTSTPKKEGIAP PSPSLVS 
DLLSEI^ISEIQKLKQQLMQMEREKAGLIATLQDTQKQIjEHTRG 
SI^EOX3EKVTRLTENI^AIjRR1>GASKERQTAL»DNEKDRDSHEDG 
DYYEVD I NG P E IIACKYHVAVAEAGELREQLKALRSTHEAREAQ 
HAEEKGRYEAEGQALTEKVSLLEKASRQDREXjLARLEKELKKVS 
DVAGETQGSLSVAQDEl^VTFSEELAl^YHHVCMCNNETPl^VKL 
DYYREGQGGAGRTSPGGRTS PEARGRRSP r LLPKGLIAPEAGRA 
DGGTGDSSPSPGSSl.PSPL£DPRREPKtri YNL1AI IRDQIKHLQ 
AAVDRTTEIiSRQRIASQEI/SPAVDKDKEAIiMEEIIiKIjKSIjLSTO 
REQ ITT1»RTVZjKAN kotaevai^lkskyenekamvtetmmklr 

NELKALKEDAATFS SlJiAMFATRCl>EYITQIJ^^ 
KKTLNSLLRMAIQQ KLALTQRiEIJLELDHEQTRRGRAKAAPKTK 
PATPSVSHTCA CAS DRAEGTGltANQVFCS KKHS XYCD 


5459 


316 


1262 


RGGHRLSGMASNFtTOIVKQGYVRIRSRiOjGIYQRCVfLVFKKASS 

KGPKRIiEKFSDERAAYFRCYHKVTEIjNNViCW^ 

G I YFNDDTS KT FACES DLEAI>E WC KVLQMECVGTR IND I S LG E P 

DL1^TGVEREQSERFNVYI^PSP^^ J GCY^1GEC^JQITYEYICLW 

DVQNPRVKLISWPLSALRBYGRPTTWFTFEAGRMCETGEGLFJF 

QTRDGEAI YQKVHSAAIAIAEQHERLLQSVKNSMLQMKMSERAA 

SLSTMVPLPRSAYWQHITRQHSTGQLYRLQDVSSPLiCLHRrETF 

PAYRSEH 


5460 


45 


2097 



RPGCRAGELSTGSRARERVRNRVSAPCGQDSRRCDPEV1.RGRSP 
GLGliAEMPS CG ACTCGAAAVRL I TSS LAS AQRG I SGGR I HMSVXi 
GRLGTFETQILQRAPLRS FTETPAYFASKDGISKDGSGDGNKKS 
ASEGSS KKSGSGNSG KGGNQLRC PKCGDI*CTHVETFVS S TRFVK 
CEKCHHFFVVLSEADSKKSIIKEPESAAEAVKLAFQQKPPPPPK 
KI YNYLDKYWGQS FAKKVXjSVAVYNHYKR I YNNIPANLRQQAE 
VEKQTSLTPRELEIRRREDEYRFTKIil^IAGISPHGKALGASMQ 
QQVNQQX PQEKRGGEVLDS SHDDI KUSKSN I IiLLGPTGSGKTLIi 
AQTLAKCLDVPFAICDCTTUTQAGYVGEDIESVIAKLLQDANYN 
VEKAQQGIVFLDEVDKI GS VPGIHQLRDVGGEGVQQGLLKLLEG 
TIVNVPEKNSRKLRGETVQVDTTNlLFVASGAFNGliDRI I SRRK 
NEKYLGFCTPSNI^KGRRAAAAADLANRSGESNTHQDI EEKDRIi 
LRHVEARDLIEFGMI PE FVGRLP WVPLHSLDEKTLVQ I IjTEPR 
NAVI PQYQALFSMDKCTLNVTEDAIjKAJCARI»ALERKTGARGrjRS 
IMBKLIJ^EPMFEVPNSDIVCVEVDKEVVEGKKEPGYIRAPTKES 
SEEEYDSGVEEEGWPRQADAANS 


5461 


1481


160 


INPPPPPKSPCGRARKWRRRRRPGAPEAAVMELPSGPGPERLFD 
SHRLPGDCFIJ^VLIJjYAPVGFCL1>VL.RLFLfG I H VFL VS CAL PD 
S VLRRFVVRTMCAVIiGLVARQEPSGIJ^DKS VRV1.I SNHVTPFDH 
NXVNLLTTCSTPLI^SPPSFVCTSRGFMEMNGRGEbVESLKRFC 
ASTRLPPTPLLLFPEEEATNGREGLLRFSSWPFSIQDWQPLTL 
QVQRPLVSVTVSDASWVSELLWSLFVPFTVYQWWLRPVHRQLG 
EANBE FALRVQQLVAKELGOTGTRLTPADKAEHNXRQRH PRiRP 
0SAQSS FPPSPqPS PDVQLATIiAQRVXEVLPHVPbGVTQRDlAK 
TGCVDLTITNLLEGAVAFMPEDITKGTQSLPTASASKFPS SGPV 
TPQPTALTFAKSSWARQESLQERKQALYEYARRRFTERRAQEAD 


5462 


663 


3353 


KX KERQMS ANN S P P S AQ KS VL PTA I P AVLPAAS P CS S P KTG L»S A 
RLSNGS FS APS LTNS RGS VHTVS F1»LQ I GI/TRE S VT I EAQE I*S L 
SAVKDLVCS IVYQKFPECGFPGMYDKII*LFRK1>MNSEN1 LQIil T 
SADE I HEGDLVEVVLSALATVEDFQ I RPHTL YVHS YKAPT FCD Y 
CGET^WGLVRQGLKCEGCGIJ^HKKCAFKTPNNCSGVRiaiRJ^^ 

VSIjPG PGI*S vprplq PEYVALPSEESHVHQEPS KRI PS WSGRP I 
WMEJKMVMC3lVKVPHTFA^mSYTRPTICQYCKRLI.KGLFRG^MQC 
KDCKFNCHKRCASKVPRDCLGEVTFNGEPSSLGTDTDIPMDIDN 
NDINS DS S RGLDDTEEPS PPEDKMFFLDPSDLDVERDEE AVKTI 
SPSTSNNI PLMR WQS I KHTKRKS S TMVXEGWMVHY TSRDNLRK 
RHYWRLDSKCLTLFQNESGS KYYKEI PLSEILRI SSPRDFTNIS 
QGSNPHCFEIITDTMVYFVGENNGDSSHNPVLAATGVGLDVAQS 
WKKAIRQA1MPVTPQASVCTS PGQGKDHKDLSTS ISVSNCQ IQE 
NVDIS TVYQ I FADE\TLGSGQFG X VYGGKHRKTGRDVA I KVT DKM 
RFPTKQESQLRNEVAILQNLHHPGIVNLECMFETPERVFVVMEK 
IiHGI»4LEMIt£SEKSRbPERlTKFMVTQ ILVALRNLHFKNIVHC 
DLKPENVIiIiASAEPFPQVKLCDFGFARl IGEKSFRRSWGTPAY 
LAPEVLRSKGYNRSLDMWSVGVI I YVSLSGTFPFNEDEDINDQI 
QNAAFM YPPNPMREI SGEAI DblNNLLQVKMRKR YSVDKSI^SHP 
WLQDYOTWLDLREFETRIGERY ITHESDDARWEIHAYTHNLVY P 
KHFIMAPNPDDMEEDP 


5463 


237 


1012 


Ll£VTMTTSRCSHLPEVI*PDCTSSAAPVVKTVEDCGSIiVNGQPQ 
YVMQVSAiaDGQLLSTVVRTIiATQS PFNDRPMCRI CHEGS S QED1* 
hS PCECTGTLX3T I HRS CL£HW1»SSSNTS YCE LCHFR FAVBRKPR 
PL VEWLRNPGPQHE KRTLFGDMVCFL F I TPLATISG WLCLRGAV 
DHliHFSSRLEAVGLIALTVALFTI VXFWTLVS FRYHCRLYNEWR 
RTNQRV1LLI PKS VNVP SNQPS LLGLKS VKRNS KETW 


5464 


195 


677


SPSMNPRKKVDIjKL I rVGAIGVGKTSIiHQYVHKTFyEEYQTTL 
GAS I LS KI 1 1 U3DTTLKLQ I WDTGGQERVRSMVSTFY KGSDGCI 
lAFDNTTDLESPEAIiDIVJRGDVliAKI^MEQSYPWVlaLGNKIDIiA 
DRKYQSILENHLTES IKLSPDQSRSRCC 


5465 


5278 


3348 



"KGDPREFIRVHREALECDYVSAHIiHEVn'IDLIFGYKQQGPAAVEA 

VNVFHHLPYEGQVDI YNINDPLKETATIGFINNPGQI PKQ.LFKK 
PHPPKRVRSRLNGDNAGISVLPGSTSDKIFFHHLDNLRPSLTPV 

KELXEPVGQI VCTDKGILAVEQNKVLI PPTWNKTFAWGYADLS C 
RI^TYESDKAhTTVYECIiSEWGQILCAICPNPKLVITGGTSTVVC 
VWEMGTS KE KAKTVTIiKQALLGHTDTVTCATAS LAYH I IVSGSR 
DRTCIIWDlJaKLSFXTQLRGJffiAPVSALCINEI-TGDIVSCAGTY 
IHWJSINGNPIVSVNTFTGRSQQIICCCMSEMNEWDTQNVIVTG 
HS DGVVRFVRMEFLQVPETP APEPAE VIjEMQED CPEAQ I GQEAQ 
DEDS SDSEADEQS I S QDPKDTPS QP5STSHRPRAAS CRATAAKC 
TDSGSDDSRRKSDQI^IJDEKDGFIFVNYSEGQTRAHLQGPLSHP 
HPNP 1 EVRNYS RLKPG YRWERQLVFRS KLTMHTAITDRKDNAHF A 
EVTALG I S KDHSR ILVGDS RGRVFS WS VSDQPG RS AADHWVKDE 
GGDS CSGCS VRFSLTERRHHCRN CGQLF CQKCS RFQ S E I KRUCI 
S S P VRVCQNCYYNL QH ERG SE DG PRN C 


5466 


3 


992 


HACAHASAHASGRLVRWWRKRRSVMG IQTS P VLLASLGVGLVTL 
IXSLAVGSYXVRRSRRPQVTLLDPNEKYIjLRLI^KTTVSHNTKRF 
RFALPTAHHTLGLFVGKH I YLSTR IDGSLVI RP YTP VTS DEDQG 
YVDLVIKVYI*KGVHPKFPEGGKMSQYU3SI*KVGDWEFRGPSGI* 
LTYTGKGHFNIQPNKKSPPEPRVAKKLGMIAGGTGITPMLQLIR 
AILKVPEDPTQCFLLFANQTEKDI ILREDLEELQARYPNRFKLW 
FTLDHPPKDWAYS KG FVTADM I REHL ? APGDD VLVLLCG PPPMV 
QLACHPNLDKLGYSQKMRFTY 


5467 


2103 


4 


GBALRVGTRGCRRDLPDPQARI FIQKKDLEEDESVTAAHLKSRG 
RSPRKIDQFCNSSNMVHGSVTFRDVAIDFSQEEWECLQPDQRTIj 
YRDVMLEN YSHLISLAGSS I SKPDVI TLLEQBKEPWMWRKETS 
RRYPDLELKYGPEK^SPENDTSEVmjPKQVIKQISTTIiGIEAFY 
FRNDSEYRQFEGLQGYQEGNINQKM I S YEKLPTHTPHASLICNT 
HKPYECKECGXYFSCGSNLI QHQS IHTGEKPYKCXECGKAFQLH 
IQLTRHQKFHTGE KTFE CKECGKAFNLPTQLNRHKN IHTVTCKLF 
ECKECGKSFNRSSNLTQHQS IHAGVKPYQCKECGKAFNRGSNLI 
QHQKIHSNEKPFVCKEOGMAFRYHYQL I EHCQIHTGEKPFECKE 
CGKAFTLLTKLVRHQKlHTGEKPFECRECGKAFSLIiNQLNRHKN 
IHTGEKP FECKECGKSFNRS SNLVQHQS IHAGI KPYECKECGKG 
FNRGAHLIQHQKIHSNEKPFVCRECEMAFRYHCQIj I EHSRIHTG 
DKPFECQDCGKAFNRGSSLVQHQSIHTGEKPYECKECX3KAFRLY 
LQLSQHQKTHTGE KPFECKECGKFFRRGSNLNQHRS IHTGKKPF 
EGKECGKAFl?IjHMHLIRKQKLHTGEKPFEC3CECGKAFRIiHMQLI 
RHQKIjHTGEKPFECKECGKVFSLPTQLNRHKNIHTGEKAS 


5468 


225 


2976 


S FLTDL FQ S IiAQliENLiCKQ LY ETTDTTTRLQ AE KAL VElr TiS S PD 
CLSKCQLLLERG S S S YSQLLAATCLTKLVSRTNNPLP LEQRI D 1 
RNYVLKYTjATRPKIiATFVTQALIQLYAR I TKLGWFDCQKDDYVF 
RNATTDVTRFLQDSVEYCI IGVTILSQI/FN EINQVSATAFL I EA 
DTTHPLTKHRKIASSFRDSSLFDIFTI^CNLLKQASGKNLNIJTO 
ES QHG LLMQLL KLTHNCLNFD FI GTS TD ES SDVLCTVQ I PTS WR 
SAFLDSSTLQLST I GRCEYEKTCALLVQLFDQSAQS YQELLQS A 
S AS PMD IAVQEGRLT WLVY I IGAVIGGRVS FASTDEQDAMDGEL 
VCRVLQLMNLTDS RIAQAGNEKIjELAMLS FFEOPRKI YIGDQ VQ 
KSSKLYRRLSEVLGUNDETTMVLSVFIGKI ITNLKYWGRCEPITS 
KTLQLLNDLS IGYSSWKLVTCLSAVQFMLNNHTSEHFS FLGINN 
QSNI*TDMRCRTTFYTAI/5RI*LMVDLGEDEDQ YEQ FMLPLTAAFE 
AVAQMFSTNS FNEQEARRTLVGLVRDLiRG IAFAFNAKTS PMMLF 
EWI YPS YMP ILQRAI ELWYHDPACTTPVLKLMAELVHNRSQRLQ 






WO 01/53312 PCT/US00/34263



FDVSSPNGILLFRETSKMITMYGNRILTLGEVPKDQVYALKLKG 
IS ICFSMLKAALSGSYVNFGVFRLYGDDAiDNALQTFI KLKhS 1 
PHSDLIJDYPKI^QSYYSLIJSVLTQDHMNFIASLEPHVIMYILSS 
ISEGLTALDTMVCTCCCSCTiDHIVTYLF^^ 

ESDRFI^IMQ^HPEMIQQMIiSTVIaNIIIFEDCRNQWSMSRPLI^ 
ULIiNEKYFSDIjIWSIWSQPPEKQQAMHLCFENLMEGIERNLL 
TKNRDRFTQNLSAFRREVNDSMKNSTYGVNSNDMMS 


5469 


134 


2653 


E^EFETSLVPWHLPMGWLCSGl^FPVSCLVLLQVASSGNMKVLQ 
EPTCVSDYMS ISTCEWKMNGPTNCSTELRLLYQtiVFLLSEAHTC 
VPENNGGAG CV CHXJjMDD WS ADNYTLDL WAG QQ LLWKG S F KPS 
1 EHVKPRAPGNItTVHTNVSDTIJJjTWSNPYPPDNYLYN^TYAVN 
IWSF^DPADFRIY^WYLEPSLRIAASTLKSGISYRARVRAWAQ 
CYNTTWSEWSPSTKWHNSYREPFEQHLJ^LGVSVSCIVILAVCLL 
CYVS ITKIKKEWWDQI PNPARSRLVAII IQDAQGSQWEKRS RGQ 
EPAKCPHWKNC1.TKLI»PCFLEHNWKRDEDPHKAAKEMPFQGSGK 
SAWCPVElSKTVIiWPESISWRCVEl*FEAPVECESEEEVEEEKG 
S FCAS P ESSR DDFOEGREG I VARLTES LFLDLLTCRRNfifi Pf onn 

MGESCUjPPSGSTSAHMPWDEFPSAGPKEAPPWGKEQPLHLEPS 
PPAS PTQSPDNI.TCTETPLVIAGNPAYRSPSNSLSQSPCPRELG 
PDPLLARKLEF^/EPEMPCVPQLSEPTTVFQPEPETWEQILRRNV 
LQHGAAAAPVSAPTSGYQEFVHAVEQGGTQASAVVGLGPPGEAG 
1 YKAFSSLLAS SAVSPEKCGFGAS SGEEGYKPFQD^I PGCPGDPA 
, PVPVPLFTFGI>DREPPRSPQSSHLPSSSPEHLG1.EPGEKVEDMP 
KPPLPQEQATDPLVDSLCSGIVYSAiTCHLCGItliKQCHGQEDGG 
QTPVMASPCCGCCC33DRASPPTTPI»RAPDPSPGGVPIiEASLCPA 
SIiAPSGISEKSKSSSSFHPAPGNAQSSSQTPKIVNFVSVGPTYM 
RVS 


5470


17 


1418


TACRI RTSLNRG I AAVKRDAVEMLAS YGLA YSIiMKFFTG PMS D P* 
KNVGliVFVNS KRDRTKAVLCMWAGAIAAVFHTL IAYSDLGYYI 
I NKLHHVDES VG S KTR RA FL YLAAF P FMD AMAWTHAG I LLKHKY 
S FL VG CAS I S D VI AQ WFVAI LLHS HL ECRE PLL I P I LS L YMG A 
LVRCTTLCLG YYKNIHDI I PDRSGPELGGDATI RXMLSFWWPLA 
LILATQRISRPIVNLFVSRDLGGSSAATEAVAI LTATYFVGHM P 
YGWLTEIRAVYPAFDKNNPSNKLVSTSNTVTAAHIKKFTFVa^A 
IiSLTLCFVMFVTTPNVSEKILIDI IGVDFAFAELCVVPLRI PS FF 
P VP VTVRAHLTGW LMTLKKT FVLAP S S VLRI IVLIASLWLPYL 
GVHGATLGVGSLLAGFVGESTMDAIAACYVYRKQKXKMENESAT 
EGEDSAMTDMPPTEEVTDI VEMREENE 


5471


1868 


658 


RS SAP PG PQRAAAATAAAAAAG VEMAAAAAQGGGGGE PRRTEGV 
GPGVPGEVEMVKGQPFDVGPRYTQLQYIGEGAYGMVSSAYDHVR 
KTRVAIKKlSPFEIIQTYCQRTX»REIQILIJiFRHENVIGIRDILR 
ASTIiBAMRDVY I VQDLMETJDL YKLLKSQQLSNDH I C YFTiYQ I LR 
GLKYIHSAmTLHRDLKPSNLLINTTCDLKICDFGLARIADPEHD 
HTGFLTE YVATRW YRAPE I MLNS KGYTKS IDIWSVG C ILAEMLS 
NRPIFPGKHYLPQI^II/3II/5SPSQEDLWCII^KARNYLQSI. 
PS KTKVAWAKLFP KSDS KALDLLDRMLT FNPNKR I TVEEALAH P 

YLEQYYDPTDEPVAEEPFTFAMELDDLPKERLKELIFQETARFO 
PGVLEAP 


5472 


1469


753 


li YVMAR YLSDEE VAVS IDRLCKANGRS PS IPFGTVRI PGRARVR 
DPQALW I FGYGSLVWRPDFAYSDSRVG FVRGYSRRFWQGDTFKR 
GSD KMPGRVVTLLEDHEGCTWGVAYQVQG EQVS KALKYLNVREA 
VLGGYDTKBV^FTP<JDAPDQPLKAIjAYVATPQNPGYLGPAPEEA 
IATQIIACRGFSGHNLEYLI1RVRDVMQLCGPQAQDEHI1AAIVDA 
VGTMLPCFC PTEQALALV 


5473


3


2119 


FMNVKIiL I QDLED I EQRVP VMDAQ YKI I TKTAHLITKESPQEEG 
KEMFATMSKLKEQiTKVKECYSPLLYESQQLLI PLBELEKQMTS 
FYDSLGKINEI ITVLEREAQSSALFKQKHQELLACQENCKKTL.T 
LIEKGSQSVQKFVTLSNVLKHFDQTRIjQRQIADIHVAFQSMVK^ 
TGDWKKHVETNSRLMKKFEESRAELBKVIJilAQE 
LLRRHTEFFSQLDQRVLNAFLKACDELTDI LPEQEQQGLQEAVR 
KLHKQW KDLQGEAP YHLIiHLKI DVE KNRFIAS AEECRTEIjDRET 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)






KI*IPQEGSEKriKEHRVF?SDKGPHHr^KRIiQLIEEI*CVKLPV 

RDPVRDTPGTCHVTIJtEI*RAAIDSTYRKIME^ 

EFSSH I STNETQLKG I KGEAXDTANHGEVKRAVE E IRNGVTKRG 

ETLSWLKSRLK^TEVSSENEAQKCGDELAICbSSSFKALVTLI^ 

EVEKMLSNFGDCVQYKEIVKNSLEELI SGSKEVQEQABKILDTB 

HBELRKLESTLJX3LERSRERQEPJiIQVTLRKWERFETNKET\A7R 
YLFQTGSSHERFLSFSSLESLSSELEQTKEFSKRTESXAVQAEN 
LVKEAS EI PLG PQNKQ LLQQQAKS I KEQVKKLEDTLEEE YVIDK 



5474 


2 


780 


TPDVRQLQASRRGIAVASWCSPRWFAGEEMAFVKS3WLLRQSTI 
IjKRWKKNWFDLWSDGHLIYYDIXJTRQNIEDKVHMPMDCINIRTG 
QECRDTQPPDGKS KDCMLQ 1 VCRDG KT 1 SLCAESTDDCLAWKFT 
LQDSRTNTAVVGSAVMTDET3WS S PPP YTAYAAPAPEVGRTLS 
LQXJAYG YGP YGGAYP PGTQ VVYAANGQAYAVP YQ Y PYAGL YGQQ 
PANQVI I RER YRDNDSDLALGMLAGAATGMALG SL FWVF 


5475 


2 


506 


ARGWLESLSLTCQTTPPPSS PCLLHS PETF I HTMP PNLTG YYRF 
VSQKNMEDYLQALNISIAVRKIALLLKPDKE I EHQGNHMTVRTL 
STFRWYTVQFDVGVEFEEDLRSVDGRKCQTIVTWEEEHLVCVQK 
GEVPNRGWRHWliEGEMLYLELTARDAVCEQVFRKVR 


5476 


192 


1457 


SDSMSLLDCFCTSRTQVESLRPEKQSETSIHQYLVDEPTLSWSR 
PSTRASEVLCSTNVSHYELQVEIGRGFDNLTSVHLARHTPTGTL 
VTI KI TNLENCNEERLKALQKAVILS HF FRHPN I TT YWTVFTVG 
S WLWVI S PFMAYGSASQLLRTYFPEGMS ETLIRNI ZiFGAVRGLN 
YLHQNGCIHRS I KASHILI SGDGLVTLS GLSHLHSLVKHGQRHR 
AVYDFPQFSTS VQPWLS PE LLRQDLHGYNVKSD I YS VGITACEL 
ASGQVPFQDiyiHRTQMLLQKLKGPPYSPLDISIFPQSESRMKNSQ 
SGVDSGIGF^VLVSSGTHTVNSDRXHTPSSKTFSPAJ?FSLVQLC 
LQQDPEKRPSASSLLSHVFFKQMKEESQDSILSLLPPAYNKPS I 
S LPP VLPWTE P ECD FPDEKDS YWEF 


5477


3 


1044 


RGNSRLRYSHEDELQLPRLPELFBTGRQLLDEVEVATEPAGSRI 
VQEKVFKGLDLLEKAAEMLSQIjDLFSRWEDLEEIASTDLKYLLV 
PAFQGALTMKQVNPSKRLDHLQRAREHF IN YLTQ CHCYHVAE F3 
LPKTMNNS AENHTANSSMAYPS LVAMASQRQAKI QRYKQKKBLE 
KjILSAMKSAVESGQADDERVREYYLIjHLQR 

E I KILRERD S SREASTSNS SRQERPPVKP F I LTRNMAQAKVFGA 
GY PSLPTMTVSDWYEQHRKYGALPDQG iakaapeefrkaaqqqe 
EQEEKEEEDDEQTLHRAREWDDWKDTH PRG YGNRQNMG 


5478 


2 


835 


KTVRI WVPNVKGESTVFRAHTATVRSVHFCSDGQS FVTASDDKT 
VKVWATHRQKFLFSLSQHINWVRCAKFSPDGRLIVSASDDXTVK 
LWDKSSRE CVHS YCEHGGFVTYVDFHPSGTCIAAAGMDNTVKVW 

MEGRLLYTLHGH0GPATTVAFSRTGEYFASGGSDEQVMVV7KSNF 
DIGDHGEVTKVPRPPATLAS SMGNLTVS ILEQRLTLEEDKLKQC 
LENQQLIMQRATP 


5479 


2 


835 


KTVRXWVPNVKGESTVFRAHTATVRS VKFCSDGQS FVTASDDKT 
VKVNATHRQKFIiFSLSQHINWVRCAKFS PDGRL I VSASDDKTVK 
L1TOKSSRECVHS YCEHGGFVT YVDFHPS GTCI AAAGMDNTVKVW 
DVRTHRLLQHYQLHS AAVNGLS FHPSGNYL I TASSDSTLKILDL 
MEGRlJliYTLHGHOGPATTVAFSRTGEYFASGGSDEQVMWKSNF 
D I GDHGEVT KVPR P PATLASSMGHnjTVS I LEQRLTLEEDKLKQC 
LENQQLIMQRATP 


5480 


444 


1952 


LSLTSRMEEAEIjVKGRIjQAITDKRKIQEEISQKRLKIEEDKLKH 
QHLKKKALREKWIjLDG IS SGKBQEEMKKQNQQDQHQ I QVLEQS I i 
LRI^KEIQDLEKAELQISTKEEAILKKLKS I ERTTEDI IRSVKV 
EREERAEES IEDIYANIPDLPKSYIPSRLRKEINEEKEDDEQNR 
RAL YAMS I KVEKDLKTGES TVLSS I PLPSDDFKGTG I KYYDDGQ 
KSVYAVSSNHSAAYNGTTCLAPVEVEELLRQASERNSKSPTEYH 
EPVYANPFYR PTTPQRETVTPGPNFOERI KI KTKGLG I GVNES I 
HNMGNGLSEERGNNFNHI SPIPPVPHPRSVIQQAEEKLHTPQKR 



LMTPWEE SNVMQDKDAPS PKPRLS PRBTI FGKSEHQNSSPTCQB 
DEBDVRYKrVKSLPPDINDTBPVTMIPMGYQQAEDSBEDKKFLT 
GYDGI IHAELWIDDEEEEDEGEAEKPSYHPIAPHSQVYQPAKP 
TPLPRKRSEAS PHEKHKS 



5481 



5482 



1492 



14 22 i NSPGSVCLCOCVCPSLLHCLPPLLLI^I^PIiLLHESPQPPALRV 

VATSSDRNFMNKKQKPVLTGQRFKTRKRDEKEKFE PTVFRDTLV 
QGL2sIEAGDDLEAVAKFTJDSTGSRI^YPJ^yAiyrLFDI 

YLEKAFEDEMKKLLLPljKAPSErrEQTKLAMI^GILI^GTLPAT 
ILTSLFTDSLVKEG IAAS FAVKLFKAMMABKDANSVTSSLRKAN 
LD}«LLELFPVNRQSVDHFAKYFTDAGLKEr*SDFLRVOQSLGTR 
KELQKELQERLSOECPI KBWTjYVKEEMKRNDIjPETAV I GLLWT 
C I MNAVE WNKKEE LVAEQAL KHL KQ Y APLLAVFSS QG QS EL I LL 
QKVQCTCTDNI HFMXAFQKTVVL FYKADVLSEEAI LKWYKEAHV 
AKGKSVFLDQMKKFVEWLQNAE EESESEGEEN 
52 8 \ THWMTGMCYA PHQVLS Y I NG VTTS KPG VS LV YSKPS RNLS LRL 

EGLQEKDSGPYSCSVNVQDKQGKSRGHS I KTLELNVL VP PAP PS 
CRLQGVPHVGANVTLS CQSPRSKPAVQYQWDRQLFS FQTFFAPA 
LDVXRGSLSLTNLS SS MAGVYVCKAHNEVGTAQCNVTLEVSTGP 
GAAWAGAWGTLVGLGL1AGLVLLYHRRGKALEEPAND I KEDA 
IAPRTLPWPKSSDTI S KNGTLSSVTS ARALRPPHGP PRPGALTP 
TPSLSSQALPS PRLPTTDGAHPQP IS PI PGGVSS SG LSRMGAVP 
VMVP AQSQAGS LV 



5483



788 



FFFFKGCRAGRGNESDYRKLEEMHQRFLVSERSKDDLQLRLTRA 
ENRI KQLETDS SEEISRYQEMIQKLQNVLESERENCGLVSEQRL 
KliQQENKQLRKETESLRKIALEAQKKAKVKI STMEHEFS I KERG 
FEVQLREMEDSNRNS I VELRHLLATQQ KAANR WKEETKKLTESA 
R 1 RT NN LKS ELS RQKIjHTQELLSQLEMANEKVAENE KL I LEHQB 
KANRLQRRLS QAEERAAS ASQQLS VI TVQRR KAASLMNLENI 



5484 



1997 



IMADMEDLFGSDADSEAERKDSDSGSDSDSDQENAASGSNASGS 
ESDQDERGDSGQPSNKELFGDDSEDEGASHHSGSDNHSERSDNR 
SEASERSDHEDNDPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSE 
AEGSEKAHSDDEKWGREDKSDQSDDEKIQNSDDEERAQGSDEDK 
LQNSDDDEKMQNTDDEERPQLSDDERQQLSEEEKANSDDERPVA 
SDNDDEKQNSDDEEQPQLSDEEKMQNSDDERPQASDEEHRHSDD 
EEEQDHKSESARGSDSEDEVLRMKR KNAITiSDSEADSDTEVPKD 
NSGTMDLFGGADDI SSGSDGEDKPPTPGQPVDENGLPQDQQEEE 
PIPETRIEVE I PKVNTDLGNDLYFVKLPNFIiSVEPRPFDPQYYE 
DEFEDEEMLDEEGRTRLK1JCVENTIRWRIRRDEEGNEI KESNAR 
I VKWS DGSMS LHLGNEVFDVYKAPLQGDHNHLF I RQGTGLQG QA 
VFKTKLTFRPHSTOSATHRXMTLSIiAlJRCSKTQKIRILPMAGRD 
PECQRTEMIKKEBERLRAS I RRESQQRRMREKQHQRGLSAS YLE 
PDRYDEEEEGEESISLAAIKNRYKGGIREERARIYSSDSDEGSE 
EDKAQRIJjKAKKLTSDE VRPNLFNSRGLS CTQE PTALNEELTDQ 
AGTN 



5485 



161 



5486 



1404 



1074

KRKI LSSMMDSE^ElCRFP ILTSSKQDI SPH ITNVGEMKHYLCG

CCAAFNNVAITFP IQKVLFRQQLYGI KTRDAILQLRRDGFRNLY 
RGILPPLWK3KTTTLALMFGLYEDLS CLLHKHVS APE FATSGVAA 
VLAGTTEAI FT P LE RVQTLLQDHKHHD KFTNTYQ AFKALKCHG I 
GEYYRGLVPILFRNGLSNVLFFGLRGP I KEHLPTATTHSAHLVN 
DFICGGLLGAMLGFLFFPINVVKTRIQSQIGGEFQSFPKVFQKI 
WLERDRKLINLFRGAHLNYHRSLISWGI INATYBFLLKVI 

142

IPGSTISWSPAAARGLSVCRCCRLHPASAMDIiFGDLPEPERSPR

PAAGKEAQKGPLLFDDLPPASSTDSGSGGPLLFDDLPPASSGDS 
GSIATSISQMVKTEGKGAKRKTSEEEKNGSEELVEKJCVCKASSV 
I FGLKGYVAERKGEREEMQDAHVILNDITEECRPPSS LITRVS Y 
FAVFDGHGGIRAS KFAAQNLHQNLIRKFP KGDVI S VEKTVKRCL 
LDTFKHTDEEFLKQASSQKPAWKI5GSTATCVLAVDNILYIANLG 
DSRAILCRYITEBSQKHAAIjSI>SKEHNPTQYEERMRIQKAGGNVR 
DGRVLGVLEVSRSIGDGQYKRCGVTSVPD I RRCQLTPNDRFI LL 

ACDGLFKVFTPEEAVNFI LSCLEDEKIQTRBGICSAADARYEAAC 
NRLANKAVQRGSADNVrVMVVRIGH 



5487 



535 



5488



1072 



253 



5489 



81 



893 



5490 



81 



5491 



204 



1194 



5492 



1896 



5493 



1876 



AVSLEQIRGWTPAPVPLPDQPCPSNCDMERVTLALIiLLAGL.TA 
LEANDPFANKDDPFYYDW KNhQLS GL I CGGIiLAI AG IAAVLSGK 

CKCKS SQKQ HS PVPEKAI PLITPGSATTC 

"AMAASGEPQRQWQEEVAAVVVA^GSCMTDLVSLiTSRLPKTGETIH 
GHKFFTGFGG KGANQCVQAARLGAMTSMVCKVGKDS FGNDYIEN 
LKQNDI STE FTYQTKDAATGTAS 1 I VNNEGQNI IVTVAGANLLL 
NTEDLRAAANVI SRAKVMVCQLEITPATSI.EALTMARRSGVKTIi 
FNPAPAIADLDPQFYTLSDVFCCNESEAE I LTGLTVGSAADAGE 
AALVLLKRG CQWI ITLGASGCWLSQTE PEP KH I PTEKVKAVD 

TTVSFKI . 

GKGPVAAFIDQSNIFLTDPXIFLGOWREEPKMPIiLLGETEPLK 

LERDCRSPVE PVJAAAS PDLALACLCHCQDLS SGAFPNRGVLGGV 
LFPTVEMVI KVFVATS SG S IAIRKKQQEWGFLEAMKIDFKEIiD 
IAGDEDNRRWMRENVPGEKKPQNGI PLPPQI FNEEQYCGDFDSF 
FSAKEENIIYSFLGLAPPPDSKGSEKASEGGETEAQKEGSEDVG 
NLPEAQEKNEEEGETATEET3BIAMEGAEGEAEEEEETAEGEEP 

GEDEDS

GKGPVAAFIDQSNIFLTDPKIFI^WREEPKMPLliLLGETEPLJC 

LERDCRS PVEPWAAASPDIiAIACLCHCQDLSSGAFPNRGVLGGV 
LFPTVEMVI KVFVATSSGS XAIRKKQQE WGFLEANKIDFKEIiD 
IAGDEDNRRWMRENVPGEKKPQNGI PI*PPQIFNEEQYCGDFDS F 
FSAKEENIIYSFLGIAPPPDSKGSEKAEEGGETEAQKEGSEDVG 
NliPBAQEKNEEEGETATEETEEIAMEGAEGEAEEEEETAEGEEP 

GEDEDS 

GSAPRIjSLG PTGAQARDPDWWARPPSRPYTQSKHDRPDTEGRSE 

QGDMAS S FLP AG AI TGDSGGELSSGDDS GEVEFPHS PE IEETS C 
IAELFEKAAAHLQGDIQVASREQLiYLYARYKQVKVGNCNTPKP 
SFFDFEGKQKWBAWJCALGDSSPSQAMQEYrAVVKKI^PGWKPQI 
PEKKGKEANTGFGGPVISSLYHEETIREEDKNI FDYCRENNIDH 
ITKAIKS KNVDWVKDEEGRAI^HWACDRGHKELVTVTitiQHRAD 
INCQDNEGQTALHYASACEFLDIVELLLQSGADPTLRDQDGCIiP 

EEVTG CKTVSL VLQRHTTGKA 

ASKNPLSAVCTTGIMSSLAVRPPAMDRS1.RSVFVGNIPYEATEE 
QLKDIFSEVGSWSFRLVYDRETGKPKGYGFCEYQDQETALSAM 
RNUSGREFSGRALRVDNAASEKNKEELKSLGPAAPIIDSPYGDP 
IDPEDAPES I TRAVASIjPPEQMFELMKQMKLCVQNSHQEARNMIj 
LQNPQIiAYAL LQAQWMRIMDPE IAIjKILHRKIHVTPL I PGKSQ 
SVSVSGPGPGPGPGLCPGPl^I^QQNPPAPQPQ^^ARRPVKDI 

PPLMQTP IQGGI PAPGP I PAAV PGAGPGSI/TPGGAMQ PQLGMPG 
VGPVPLERGQVQMSDPRAPIPRGPVTPGGLPPRGLLGDAPNDPR 
GGTLLSVTGEVEPRGYIX5PPHQGPPMHHASGHDTRGPSSHEMRG 
GPl^DPPXLIGEPRGPMIDQRGLPMDGRGGRDSRAMETRAMETE 
VLETRVMERRGMETCAMETRGMEARGf4DARGLEMRGPVPS SRGP 
MTGGIQGPGP INIGAGGP PQG'pRQVPGI SGVGNPGAGMQGTO I Q 
GTGMQGAG I QGGGMQGAG 1 QGVS I QGGG I QGGGIQGAS KQGGSQ 
PSS FSPGQSQVTPQDQEKAALI MQ VI*QI*TADQ IAMIiP PEQRQS I 

LILKEQIQKSTGAS 

RAPMMTKAVPEEPRKPGRLTQAI2^SPLTWEHVHICVPGGTPE)CI> 

TDTFRVKRPHXRRSASNGHVK3TPVYREKEDMYDEI IELKKSLH 
VQKSDVDLMRTKLRRI»EEENSRIO>RQIEQIiLDPSRGTDFVRTLA 
EKRPDASWVINGLKQRIIiKLEQQCKEKDGTI SKUQTDMKTTNLE 

EMRXAMETYYEEVHRIiQ'rZiLAS S ETTG KKPIjGEKKTGAKRQ KKM 
GSAI^LSRSVQELTEENQSUCEDLDRVLSTSPTISKTQGYVEW 
SKPRIJ^IVELEKKLSVtffiSSKSHAAEPVRSHPPACLASSSAL 
HRQPRGDRNKDHERLRGAVRDLKEERTALQEQIiQRDLEVKQLL 
QAKADLEKELECAREGEEERREREEVLREEIQTLTSKLQELQEM 
KKEEKEDCPEVPHKAQELPAPTPSSRHCEQDWPPDSSBEGLPRP 
RS P CSPGRRl>AAARVLQAQWKVYlGiKKKKAV7^DEAAVVIiQAAFR 










GHLTRTKLliAS KAHGSEPPSVPGLPDQSSP VPR VPSP IAQATGS 
PVQEEAIVI IQSALRAHLARARHSATGKRTTTAASTRRRSASAT 
HGDAS S P PFLAALP DPS P SG PQAVAPLPGDD VNSDDSDD I V1AP 
SLPTKNFPV 


5494 


71 




536 


RS KAKIGTP TREVPSTDMXVRRESSS SXrTHRPAPS PATPRLLGT 
RRVUjGVS EGTGCADAMELVLVFLCS LLAPmnLASAAEKEKEKD 
PFHYDYC2Tl.RIGGLVFAWLFSVGILraILSRRCKCSFNQKPRAP 
GDEEAQVENI>ITANATEPQKAEN


5495 


273 


2168


DSLLLIQVDTMPFTIoHIJlSRLPSAIRSLILQiCKPNIRNTSSMAG 
ELRPASLWI,PRSlAPAP^RFCQVNTGPIiPIiIX3QSEPEKWMLPP 
QGAISETRMGHPOFWKYEFGACTGSLASlrEQySEQLKDMVAFFL 
GCS FSIiEEALEKAGLPRRDPAGHSQAGAYKTTVPCVTHAGFCCP 
LWTMRP I PKDKIjEGLVRACCSIjGGEQGQPVHMGDPEIiIjG I KEL 
SKPAYGDAWCPPGEVPVFWPSPLTSLGAVSSCETPLAFAS IPG 

1 CTVMTDLKDAKAPPGCLTPERIPEVHHISQDPLHYSIASVSASQ 
KIRELESMIGIDPGNRGIGmiLCKDEBlJCASLSLSHARSVLITT 
GFPTHFNHEPPEETDGPPGAVALVAFLQAI*EKEVAJIVDQRAWN 
LHQ K I VEDAVEQG VLKTQ I P ILTYQGG S VEAAQ AF LCKNG D PQT 
PRFDKIjVAIERAGRAAIXSNYYNARKMNIKHLVDPIDDLFLAAKK 
IVGISS TGVGIX3GNEIX3MGKVKEAVRRHIRHGDVI ACDVEADFA 
VlAGVSNWGGYALACAXiYILYSCAVHSQYLRKAVGPSRAPGDQA j 
WTQALPSV^KBEKMLGILVQHKVRSGVSGIVGMEVDGLPFHNTH 

1 AEMIQKI*VDVTTAQV 


5496 


3 



2408 


QDTKMHEI YKGNITPQLNKNTLKTSAATDVWAVYFSQFW IDYSG 
MKSGKGRPISPVDSFPIjS1WICX3PTRYAESQKEPQTCNQVSIjNT 
SQSBSSDI^RLKRKKLIiKEYYSTESEPLTNGGQKPSSSDTFFR 
FSPSSSEADIHLLVHVHKHVSMQINHYQYLlJ J IiFI J 31ESLILliSE 

NLRKDVEAVTGSPASQTS icigillrsaelalllhpvdqantlk 

SPVSESVSPWPDYIiPTBNGDFIiSSKRXQISRDINRIRSVTVNH 
MSDKRSMS VDLS H I PLKDPLLFKSAS DTNLQKG I S FMDYLS DKH 
LGKISEDESSGLVYKSGSGEIGSETSDKKDSFYTDSSSVLNYRS 
DSNII^FlJSrX^QNILSSTLTSKGNBTIESIFKAEDLL.PEAASL 
SENIJDISKEE^PPVRTIJCSQSSLSGKPKERCPPNIAPLCVSYKN 
MKRS S S QMS LDTI SLD SMI LEEQLLE SDGSDSHMFLEKGNKKNS 
TTNYRGTAES VNAGANLQNYGETS PDAI S TNSEGAQENHDDLMS 
VWFKITGVNGE IDIRGEDTE ICLQVNQVl'PDQLGNI SLRH YLC 
NRPVGSDQKAVIHSKSSPE I SIiRFESGPGAVIHSI*LAEKNGFLQ 
CHIKWFSTEFLTSSl^IQHFLEDETVATVWPMKIQVSOTKI^ 
KBDSPRSSTVSLEPAPVTVHIDHI>V\mRSDDGSFHIRDSHMLNT 
GNDLKENVKSDSVLLTSGKYDI.KKQRSVTQATQTSPGVPWPSQS 
ANFPEFS FI?FT!P^QI/^ENSSI^QEZiAlCAKMAIjAEAHLEKDALL 
HHIKKMTVE 


5497 


1821


3308


SIS KLLKRRS N I DAYLLS NS CAFFAPR I*FS LASQ 1 1 REQQS PNV 
CFIYKYSGFPSLECQCHFVSPHSSCY1NFFSFPPPFFVCFQLSN 
GFSHYSLSSESHVGPTGAGLFPHCLPASRLLPRVTSVHLPDYAH 
YYTIGPGMFPSSQI PSWKDWAKPGPYDQPLVNTLQRRKEKREPD 
PNGGGPTTASGPPAAAEEAQRPRSMTVSAATRPGEEMEACEELA 
IJUjSRGI^LDTQRSSRDSIiQCSSGYSTQTTTPCCSEDTIPSOVS 
D YDYFS VSGDQEADQQEFDKS ST I PRN S D I SQS YRRM FQAKRPA 
STAGLPTTLGPAMVTPGVATIRRTPSTKPSVRRGTIGAGPIPIK 
TPVI PVKTPTVPDLPGVLPAPPDGPEERGEHS PES PSVGEGPQG 
VTSMPSSP1WSGQASVNPPLPGPKPSI PHEHRQAI PBSEAEDQER 
EPPSATVS PGQI PESDPADLSPROTPQGEDMLNAIRRGVKLKKT 
TTNDRSAPRFS 


5498 


2434


1492 


ILTOQEIFTGEXPCECGKASIQMSHLSQQKIYSGENPFACKVCG 
KVTSHKSNLTEHEHFHTREKPFECNEOGKAFSQKQYVIKHQITrH 
TGEKLFEC^ECGKSFSQKENLLTHQKTHTGEKPFECKDCGKAFI 
QKSNLIRHQRTHTGEKPFVC3CECGKTFSGKSNLTEHEKIHIGEK 
PFKCS ECGTAFGQKKYLIKHQNIHTGEKPYECNECGKAFSQRTS 
LIVHVRJHSGDKPYECNVCGKAFSQSSSLTVHVRSHTGEKPYGC 
NECCKAFSQFSTLALHIiRIHTGKRPYQCSECGKAFSQKSHHIRH 








QKIHTH 


5499


*j o /i 




GFRO TG RGH K I ^TY PFS PR K <5G RKGMAQSOG WVKRY I KAFCKG F 
FVAVTVAWFIJDRVACVARVEGASWPSI^PGGSQSSDVVIJ^OT 
WKVRNPEVHRGD I VSIiVS P KNPEQKI I KRVIALEGDIVRT IGHK 
NR YVKVPRGH I WVEGDHHGHS FDSNS FGPVS LGLLHAHATH I LW 
PPERWQKLESVLP PERiPVQREEE 


5500 


1978 


1286 


KPDWRIjQNIjPPP^YLWRSSRFGFGHLKKRLQMDFKIEHTWDGFP 

DYEWEAFFLND I TEQYLEVBl^PHGQHLVLLLSGRRNVVJKQEL. 
PI^FRVSRGETKWEGKAYLPWSYFPPNVTKFNSFAIHGSKDKRS 
YEALYPVPQHBLQQGQKPDFHCLEYFKS FNFNTLLGEEW kq p ss 
DLWLIEKCDI 


5501 


2927 


2226 


CRPPVSARVAPGHQGAVGGSGRRPARVEWDAAARPSSRPFS1»P 
AAIMI^ISRLIJDWFR^LFWKEEMELTLVGI^YSGKTTFVNVIA 
SUyroEUMl P 1 vbrNrlKKV 1 :\Aj1NV 1 i. r».lVNUJ.(orLiy Jrtcr KorlWJjKl 
CRGVNAI VYMIDAADREK1 EASRNELHNLLDKPQLQG I PVLVLG 
NKRDLPNALDE KQI»I EKMN1»SAIQDREI CCYS I S CKEKDNID I T 
LQWIiIQHSKSRRS 


5502 


3 


624 


N5AFP VW VPERTAJL»IjTCPIjGAAFv3^ lAMo KJj 
GKFFKGGGSSKSRAAPSPQELALVRLRETEEMLGKKQEYIjEIIRIQ 
REIAIAKKHGTQNKRAALQALKRKKRFE KQLiTQIDGTl*ST IEFQ 
REAiENSKTNTEVI 1 RN^^FAAKA^^<SVHENMDLN'KI DDLMQE IT 
EQQDIAQEISEAFSQRVGFGDDFDEDEIiMAELEEIjEQEELNKKM 
TNIPIiPITWSSSIjPAQPNPJCPGMSSTARRSRAASSQRAEEEDDD 

IKQLAAWAT


5503 


216


654 


KGVRRRGRVRSDSEDSHLGYFKMSFIjI*PKLTSKKEVDQA1KSTA 
EKVLVLRFGRDEDPVCLQLDDI LSKTSSDLSKMAAI YLVDVDQT 
AVYTQYFDIS YI PSTVFFFNGQHMKVDYGGEDPALRSIKAVRRT 
SPAGTLGEKPVKS 


5504 


58 


3563 


QLSFSFQAPVTFDDITVYLI^EEWVLLSQO^KELCGSNKLVAPI, 
GPTVANPEbFRKFGRGPEPWlk;SVOGQRSLIiEHHPGKKX^GY^ 
' EMEVQGPTRESGQSLP PQKKAYLSHIjSTGSGHI EGD WAGRNRKI* 
LKPRSIQKSWFVQFPWLIMNEEQTALFCSACREYPSIRDKRSRI* 
IEGYTGPFKVETLKYHAKSKAHMFCVNAIAARDPIWAARFRSIR 
DPPGDVLAS PEPLFTADCP I FYPPGPLGGFDSMAELI.PSSRAEL 
EDPGGIX3AIPAASYIJ3CISDI.RQKEIIT)GIHSSSDIWII.YNDAVE 
SCI QDP SAEGI*SE VVFEEliPWFEDVAVYFTREEWGMLDKR 
Qx^LYRDVMRMNYELLASLGPAAAKPDLISKLF^ 
GPKWGKGRP PGNKKMVAVREADTQASAADSALIiPGS PVEARAS C 
CSSSICEEGDGPRRIKRTYRPRSIQRSVJFGQFPWI*VIDPKETKI* 
FCSACI ERPNLHDKSSRLVRGYTG PFKVETIiKYHEVS KAHRiCV 
NTVE IKEDT PHTALVPE I SSDLMANMEHFFNAAYS IAYHSRPLN 
DFEKILQLLQSTGTVILGKYRNRTACTQFI KYISETLKREILBD 
VRNS PCVSVLLDSSTDAS EQACVGI YIRYFKQMEVKES YITLAP 
LYSETADGYFETI VSALDELDI P FRKPGWWGLGTDGSAMLSCR 

rrt.VPIfPnTfVT DOT.T.t5\7T-Tr r \72^MT? T.KT.ZXVVDAPG^TDT .VTCXr*nRH 

IRTVFKF YQS SNKRLNELQEG AAPLEQE 1 1 RIiKDLNAVRWVAS R 

RRTLHAIJLtVSWPAIARHLQRVAEAGGQIGHRAKGKLKLMRGFHF 

VKFOiFLU>FLSIYRPI^EVCQKEIVLITEVNATlJGRAYVAI*ES 

LRKQAGPKEEEFNAS FKDGRXKG I CUDKLE VAEQRFQADRERTV 

LTGIEYIjQQRFDADRPPQLI<NMEVFlDT^tAWPSGIEIiASFGNDDI 

IjNLARYFECSIiPTOTSEEJu^EEWI^LKTIAQHI^FSM 

AQHCRFPJjLS KIMAWVCVP 1 5TSCCERGFKAMNRIRTDERTKI* 

SNEVLNMLMMTAVNGVAVTBYDPQPAlQHVryLTSSGR^ 

CAQ VPARSPAS ARLR ICEEMGAL YVEE PRTQ KP P ILPSREAAEVL 

KDC IMEPPERiLYPHTSQEAEGMS 


5505 


3312 


1219 


NCS PRSbSAAja^SNRNNNKLPSMLPQLQNLI KRDPPAY I EEFLQ 
Q YiWHYKSNVE I FKLQ PN KP S KELAEL VM FMAQ I S HCYPE YLSNF 
PQEVKDIiSO^TVLDPDI,RWrFCKAIilLIiRl^ 
FFEl*FRCHDKI*IiRKTL YTHI VTD I KN INAKH KNNKVNWLQNFK 




YTMLRDSNATAAKMSLDVMXEL YRRN1WNDAKTVNV I TTACFS K 
VTKI L> VAAL TF FIXBKDEDBKQDSDS ESEDDG PTARD LL VQ YATG 
KK5SKNKKKLEKAMKVLKKHRKKXKPEV 

EKILKQIjE CCKERFEVKMMLMNlil SRLVG IHELfX FNFYPFTjQR 
FLQPHQRBVTKILLFAAOASHHLVPPEI IQSLLMTVANNFVTDK 
NSGEVMTVG IWAIKEI TARCPIJU*1TEEIJliQDLAQYJCTHKDKNWI 
MS ARTL I RL FRTLNPQ MLQ KK FRG KPTEAS I EARVQEYGELDAK 
DYIPGAEVLEVEKEENAENDEDGWESTSLSEBEDADGEWIDVOH 
SSDEEQQEISKKLNSMPMEERKAKAAAISTSRVXiTQEDFQKIRM 
AOMRKEIiDAAPG KSQKRKY IE I D SDEEPRG E LLS LRD I ERIjHKK 
P KSD KETRIjATAMAGKTDRKEFVRKKTKTNP FS SSTNKEKKKQK 
NFMMMR YSQNVRS KNKRS FT^KQLALRDALLKKKKRMK 


5506 




1531 


FRGDIjCGQRGGSAPGEGGSSAWPAPAHPIjPEREREREALC PGRS 
CSGGGGEETPGTTPVWSPIjEGGGDEEIjRPNPYVRFPYRWWAVW 
LAAFPS liGAGGETPEAPPES WTQLWFFRFWNAAGYAS FMVPGY 
IXVQYFRRKNYLETGRGLCFPLVXACVFGNEPKASDEVPIiAPRT 
EAAETTPMWQALKliFCATGLQVSYLTVIGVLQERVMTRSYGATA 
TSPGERFTDSQFLVIJWRVIALIVAGLSCVIiaCQPRHGAPMYRY 
SFASLSNVIiSSWCQYEALKFVS FPTQVLAKAS KVI PVMLMG KLV 
SRRSYEHWEYLTATLISIGVSMFLI>SSGPEPRSSPATTLSGLII> 
LAGY I AFDS FTSNWQDALFA YKMS S VQMMFG VNFFS CL FTVGSL, 
LEOGAI>LEGTRFMGRHSEFAAHALLLS ICS ACGQLFIFYTIGQF 
GAAVFTI I MTXRQAFAI LLS CXLYGHTVTVVGGLGVAVVFAALL 
LRVYARGRLKQRG KKAVP VE3 PVQKV 


5507 


3704


1271


PRGTRRCRPAGRASRRARRRPPCPGPA^PGSLE IGGFGTAAG KK 
VAVAD VQ FGPMRFHQDQLQ 1 /LI> VFT KE DNQ CNG FCRACE KAG FK 
CTVTKEAQAVLACFLDKHHDI 1 1 IDHRNPRQLDASALCRS IRSS 
KJ.iS EN TVI VGVVRRVDREE1*S VMPFI SAGFTRRYVENPNI MACY 
NEZJLQLEFGFATRSQIJCLRACNSVFTALENS EDAIEITSBDRFIQ 
YANPAFETTMGYQSGELIGKEI/SEVPINEKKADLriDTINS CIRI 
GKEWQGI YYAKKKNGDNIQQNVKI IPVIGQGGKIRHYVS I IRVC 
NGNNKAEKI SB CVQSDTHTDNQTGKHKD RRKG S IiDVKAVASRAT 
EVSSORRHSSMARTHSMTIEAPITKVINIINAAOESSPMPVTEA 
LDRVIiE ILRTTELYSPQFGAKDDDPHANDLVGGLr-ISDGIiRRLSG 
NEYVLSTKNTQMVSSNI ITP ISLDDVPPRIARAMENEEYWDFDI 
FELEAATHI^PLIYI^GLKMFARFGICEFLHCSESTLRSWLQI IE 
ANYHSSNPYHNSTHSADVIJIATAYFXSKERIKETIjDPIDEVAAI> 
IAATI HDVDHPGRTNSFLCNAGSELAI I#YNDTAVLESHHAALAF 
QLTTGDDKCWIFIO^MERNDYRTLRQGI IDMVLATEMTKHFEHVN 
KFVNS INKPIiATLEENGETDKNQEVINTMIjRTP ENRTIjI KRML I 
KCADVSNPCRPI^YCIEWAARISEEYFSQTDEEKQQGIiPVVMPV 
FDRNTCS I PRSQ IS FIDYF I TDMFDAWDAFVDI* P D LMQHI>DNMF 
KYWKGIjDEMKLRNLRPPPB 


5508 


1151 


691 


LSSVFSRRSASMFAVGCSMGPFI.HYWYLSLDRLFPASGIjRGFPN 
VLKKVLVIX?LVASPLI^WYFI>Gl>GCIiEGOTVGESCQKLi^ 
EFYKADWCVWPAAQFVNFLFVPPQFRVTYINGLTIiGWDTYLSYL 
KYRSPVPLTPPGCVAI>DTRAD 


5509 


1238 



619 


RXSRGCQNAIjSASGPAAAAAAIMVRKLKFHEQKLLKQVDFI^NWE 

vtdhnlhei*r\0,rryrlqrredytrynqi^ravre1*arrlrdlp 
erdqfrvrasaalldklyalglvptrgs^ 

LPTVLLKLRMAQHLQAAVAFVEOX5HVRVG PDNATTOPAFIjVTRSM 

edfvtwvdsskikr^vleyneerddfdlea 


5510 



96 


1195 


PAGAHLSSGSSEPLVEPGRGRVGARVKGERGLQASGSAPGRS KM
AEGERQP P PDSSEEAPPATQNFI I PKKEIHTVPDMGKWKRSQAY

ADYIGF I LTLNEGVKGKiQ^TFEYRVSEAlEKLVALLNTLDRWID 
ETPPVTOPSRPG1^YRTWYAKIJ3EEAENI*VATVVPTH^ 
EVAVYLKESVGNSTR IDYGTGHEAAFAAFLCCLCKIGVLRVDDQ 
I AI VFKVFNR YLEVMRKDQKTYRMEPAGSQG VWGIiDDFQFLP FI 
WGSS QI*H>HP YliE PRHFVDE KAVNENHKD YMFLECI LFI TEMKT 
GPFAEHSNQLV^ISAVPSWSKVNOGLIRMYKAECiEKFPVIQHF 
XFGSLLPIHPVTSG 


5511 


276 


1980 


KLSRVLNLPPENLI TSISAVP I SQKEE VADFQLSVDSLLEKDND 
HSRPDI Q VQAKRIiAEKEjRCDTVVS E I STGQRTVN FKI NRElyLT K 

TVT JTDV T P ^VS Q TTVC5 T .TTQFT .P ^RT ■POKTC T WRP*?*? PWVttffV yuvn 
± v Ltf^*^ v -L Cj _ajcj i\ miiixj riut ovjuryiuvi v v or ijo zt ix v j-vr\_rvc n vvj 

HIjRST I IGHFlANLKEAIiGHQVIRJNYliGDWGtXJFGLIiGTGFQb 
FGYEEKLQSNPI^HLFEVYVQVNKEAADDKSVAKAAQBFFQRLE 
LGDVQALSLWQKFRDLS I EEY IRVYKRLGVYFDE YSGESF YREK 
SQJEVLKLLESKGLLliKTl KGTAWDLSGNGDPSS I CTVMRSDGT 
SLYATRDlAAAIDRMDKVKFDTMIYVTDKGQKKHFQQVFQMLiKI 
MGYDWAERCQHVP FGWQGMKTRRGDVTFLEDVLNE IQLRMJjQN 

VFXJSRGDTGVFIjQYTHARIjHSLEETFGCGYLNDFNTACLiQEPQS 
VSILQHLIiRFDEA^YXSSQDFQPRHIVSYLLTLSHLAAVAHKTL 
QI KDSP PEVAGARLHIjFKAVRSVIJ^GMKIJLCITPVCRM 


5512 


120 


1015 


DPSLIiLTITVTGVTVLVLVLKSMNSRRREPITLQDPEAKYPLPL 
IEKEKISHNTRRFRFGLPSPDHVLGLPVGNYVQLLAICIDNELW 

raytpvssdddrgfvdli iki yfknvhpqypeggkmtqy^enmx 

IGET I FFRG PRGRLF YIIG PGNJjG IRPDyTc* fc. P KK I JjADHIioMXA 
GGTGITPMLQLI RKITKDPSDRTRMSLI FANQTEEDI LVRKEIiE
EIARTHPDQFDLWYTLDRPPIGWKYSSGFVTADMIKEHLPPPAK 
STLILVCGPPPblQTAAHPNLBKLGYTQDMI FTY 


5513 


2 


637 


ARWRLPSDS PR I PPAGAETPGRGSCRNYJJPSSSPPPPEPSSFPS 
P PTSRGGPGSRDTMSDSEEESQDRQLKI WLGIX3ASG KTSLTTC 
FAQETFGKQYKQTIGLDFFLRRITLPGNliNVTIiQIWDIGGQTIG 
GKMIjDKYIYGAQGVIjLVxDITNYQS fenuedw ytwkkvseese 
TQP 1»VAL VGN KI DLEHMRT I KP E KHLRFCQENG FS SH FVSAKTG 
DS VFltCFQKVAAEIIiG I KIiNKAE I EQSQR VVKAD I VN YNQ EPMS 
RTVNPPRSSMCAVQ 


5514 


1295 


449


VNRPS W I MGN FRGHAI»PGT FFF I IGLWWCTKS I LKYI CKKQKRT 
CYLGSKTL FYRliE XLEG I T I VGMA1/TGMAGEQF I PGGPHLMI*YD 
YKQGHWNQLLG1WHHFTMY FFFGLLGVADIL CFT I S SIiP VS I/TKXi 
MLSNAL FVEAF I FYNHTHGREMLD I FVHQIaLVL WFL.TGLVAF1, 
EFLVRNJT^IiELIiRSSIjIbliQGSWFFQlGFVLYPPSGGPAWDLM 
DHENILFLTI CFCWHYAVTI VIVGMNYAFITWLVKSRLKRLCSS 

Ct V<j1j1jAJN Afc»K±.Vj iioJic.iir'i 


5515 


1572 


260 


FVRIi VGRGDCD PLLS VCI*TTM PLY EGLGSGGE KTAWTDLGEAF 
TKCGFAGETGPRCI I PS VI KRAGMPKPVR WQYN INTEELYS YI* 
KEPIHILYFRKLLVNPR0RRWT IESVLCPSHFRETLTRVLFKY 
FEVPSVULAPSHLMAIJ^TICIDTSAMV^^ 
VLNCWGALPIGGIOa J HKEl>ETQIJ^QCTVDTSVA^ 

NVDYPLDGEKILHILGS IRDS WEII>FEQDNEEQSVATLI LDSL 
I0CP IDTRKOIAENLWIGGTSMI^GFLHRLLAE IRYJjVEKPKY 
KKAIX5TKTFR1HTPPAKANCVAWLGGAI FGALQDILGSRSVSKE 
YYNQTGRIPDMCSLNNPPLEMMFDVGKTQPPLMKRAFSTEK 


5516 


3 


735 


NSREPPOAGPGPSPRXSPTASSFLFPWRPIASSFWMGAQGAOES 
I KAMWRV PGTTRRP VTGES PGMHRPEAMliLLLTLALLGGPTWAG 
KMYGPGGGKYFSTTEDYDHElTGIiRVS VGIjLUVKS VQVKLGDS W 
DVKLGALGGNTQE VTLQPGE Y; I TKVFVAFQAFIjRGMVMYTS KDR 
YFYFGKLDGQIS2AYPSQEGQVI.VG1YGQYQLLGIKSIGFEWNY 
PLEEPTTEPPVNLTYSANSPVGR 


5517 


246 


499 


S E I YVAMRTDSS KMTDVESGVAN FAS S ARAGRRNALP D I QS SAA 
TDGTSDLPLKLEAI^VKEnAKEKIJEKTTQDQLEKPQI^ 


5518 


3 


1375 



DAWADAVA^lAVIDLNMDFPCLWIiGLL 

FliKT VAQN YSSVTHLHS I GKSVKGRJ^WVtiVVGRFPKEHRIG I P 
EFKYVANMHGDF^GREUJLHIiIDYI,VTSDGXDPEITm,INSTR 
IHIMPSMNPIX3FEAVKKPDCYYSIGRENYNQYDLNRNFPDAFEY 
NNVSRQPETVAVMKWLKTCTFVljSANLHaGAIiVAS YP FDNGVQA 
TGAIi YSRS LTPDDDVFQYLAHT YASRNPNMKKGDECKNKKNFPN 
GVTNGYSWYPLCXKMQDYNYXWAQCFEITI,EI^CCKYPREEKLP 
S FWN13NKASI»I EYI KQVH LGVKGQVFDQNGK PLPNV I VE VQDRK








HICPYRTNKYGEYYUjliliPGS YI INVTVPGHDPHITKVI I PEKS " 
QNFSALKKDI LLPFQGQLDS I PVSNPSCPMI PLYKNLPDHSAAT 
KPSI»FLFI>VS IiLH I FFK 


5519


87 


477 


I KSKLNQQVEVQESEWR1.TEAKG PTMGKESGWDSGRAAVAAVVG 
GVVAVGTVLVAbS AWGFTSVG IAASS IAAKMMSTAAI ANGGGVA 
AGS LVAI LQS VGAAGLS VTSKV I GG FAGTALGA WLGS P PS S 


5520 


117 


943 


PTEGRQKVLKTFTVPRSAIjAMTKTSTCIYHFLVIjSWYTFIjNYYI 
SQEG KDE VKP KXLANGAJRWKYMTIiLWLLLQTI FYGVTCLDDVLK 
RTKGGKDI KFI»TAFRDLLFTTIAFPVSTFVFLAFWII*FLYNRDtj 
IYPKVUXIVI?VWI*NHAIWTFIFPITIiAEVVLRPHSYPSKKTGIj 
TLLAAAS XAY I SRI I»WI>YFETGTWVY PVFAKLSIiljGIiAAF FSLS 
YVFrASIYLLGEKLNHWKWVSVQILQRWRLESVGICFQWPDWKS 
PAKHQLVKNIR 


5521 


546


911 


KI LNMQKSCEENEGK PQNMPKAEEDRPLEDVPQEAEGNPQPSEE
GVSQEAEGNPRGGPNQPGQGFKEDTPVRHLDPEEMIRGVDBLER
LREEIRRVRNKFVMMHWKQRHSRSRPYPVCFRP


5522 


1224 


637 


GSRPXjGQRSREKMWVFGYGSLIWKVDFPYQDKLVGYITOTSRRF 

wqgstdhrgvpgkpgr wtl»ved PAG CVWG VAYRLFVG kee evk 

AYLDFREKGGYRTTTVIFYPKDPTTKPFSVIjLYIGTCDNPDYL^ 

paplediaeqifnaagpsgrnteylfelansirnlvpeeadehl . 
fai£klvkeri,egkqni*nci j 


5523 


3 



1280 


SKGKIOWGSSMSAATARRPyFDDKEDVNFDHFQILRAIGKGSFG 
KVC I VQKRDT E KM YAMKYMNKQQC I ERDEVRNV FR ET.F.ILQEIE 
HVFLVNLWYS FQDEB15MFMWDLLLGGDLRYHLQQNVQFSEDTV 
RLY I CEMAIiALD YTjRGQHI IHRDVKPDNILLDERGHAHLTDFNI 
ATI I XDGERATALSGTKPYMAPEI fhsfvnggtg ysfe vdwwsv 
GVMAYEIjLRGWRPYI>IHS SNAVESI*VQI*FSTVS VQYVPTWS KEM 
VALLRKLJjTVNPEHR LSSIjQD VOAAPAIiAGVI*WDHLSE kr ve pg 
FVPNKGRI*HCDPTFEL£EMII^SRPIjHKKXKR^ 
SSQSEITOYLQDCLDAIQQDFVIFNREKI.KRSQDLPREPLPAPES 
RDAAEPVEDEAERSALPMCGPICPSAGSG 


5524 


85 


2318 


RERERDHRPGESSQGQSGAGGCFPSPTMEIiRCGGltLiFSSRFDSG 
NLAHVEKVES LS SDGEGVGGGASAI/TSGI ASS PDYE FNVWTRPD 
CAETEFENGNRSWFYFS VRGGMPGKIi I KINIMNMNKQSKIjYS QG 
MAPFVRTLPTRPR WER I RDR PTFEMTETQFVLS FVHRFVEGRGA 
TTF FAFCYPFS YS DCQEI*IiNQLDQRFPENHPTHS S PLDT I Y"YHR 
EIXCTSLDGLRVDIJ:TITSCHGIjREDREPRX.EQI.FPDTSTPRPF 
RFAGKRIFFXSSRVHPGETPSSFVFNGFLDFILRPDDPRAQTIiR 
RLFVFKLI PMLNPDGWRGHYRTDSRGVNLNRQ YLKPDAVDHPA 
lYGAKAVLliYHHVHSRIiNSQSSSEHQPSSCLiPPDAPVSDljEKAN 
NLO^fEAQCGHSADRHNAEAWKOTEPAEQKLNSWIMPQQSAGliE 
ESAPDTI PPKESGVAYYVDLHGHASKRGCFMYGNSFSDESTQVE 
NMLYPKLISLNSAHFX>FQGCKFSEKNMYARDRRI)GQSKEGSGRV 

AI YKASGI IHS YTLECNYNTGRSVNS I PAACHDNGRASPPPPPA 
FPSRYTVELFBQVGRAMAIAAI^MABCNPWPRIVLSEHSSI»TNL 
RAWWL KHVRNSRG I^STLNVG VNKKRGLRTPPKSHNGLPVS CS E 
NTLSRARSFSTGTSAGGSSSSQQNSPQMKNSPSFPFHGSRPAGI* 
PGLGS STQKVTHRVLG PVRGKPVWEPLQHVFGCIiGHCWGK 


5525 


105


834 


SNTLDFERHLFIMGQQISDQTQLVINKL.PEKVAKHVTLVRESGS 
LTYEEFI^RVAELNDVTAKVASGQEKHLLFEVQPGSDSSAFWKV 
VVR^A/CTKINKSSGIVEASRIMNLYQFIOliYKDITSQAAGVXtAQ 
SSTSEEPDENS SS VTSCQASLWMGRVKQliTDEEECCI CMDGRAD 
L I LP CAHS FCQKCI DKWSDRHRNCPI CRI*QMTGANESWVVSDAP 
TEDDMAN Y I LNMADEAGQPKRP 


5526 


3 


853 


RRPCN PVRAAKRTGAAARA PRGIJ2 VTMLR VAWRTLS L I RTRAVT 
QVLVPGLPGGGSAKFP FKQWGLQPRSLLLQAARGYWRKPAQSR 
I^DDPPPSTIXKDYQNVPGIEKVDDVVKRIiSLEMANKKEMLKJ 
KQEQFMKKIVANPEDTRSLEARI IALS VKI RS YEEHLEKHRKDK 
AHKRYIiLMS IDQRKKMIiKNI»RNTNYDVFEKI CWGLG I E YTFP PZ* 
YYRRAHRRFVTKKAliCIRVFQETQKIiKKRRRAIiKAAAAAQKQAK 








RRNPDS PAKAI PKTLKDSQ 


5527 


3225 


565


LLR XYLLH QN PLLLRHQPNRTC I S FSATMXLXDTKS RP KQSS CG 
KFOTKGIKVVGKWKEVKIDPNMPADGCMDDLVCFEELTDYQLVS 
PAKNPSSLFSKEAPKRKAQAVSEEEEEEEGKSSSPKKKI KLKKS 
KNVATEGTSTQKEPEVKDPEIjEAQGDDMVCDDPEAGEMTSENLV 
QTAP KKKKNKG KKGLE PSQSTAAKVPKKAKTW I PEVHDQKADVS 
AWKDLFVPRP VLRALS PU3FSAPTP IQALTLAPAXPO^KLDILGA 
AETGSGKTIiAFAlPMIHAXajaWQK^NAAPPPSNTEAPPGETRTE 
AGAKTRS PG KAEAESDALPDDTVI ESEALPSDIAAEARAKTGGT 
VSDQALLFGDDDAGEGPS SLIRE KPVPKQNENEEENLDKEQTGN 
LKQELDD KS ATCKAYP KR PLLGLVLTPTRELAVQVKQH IDAVAR 
FTG I KTAILVGGMSTQKQQRMLNRRPE I WATPGRLWELI KEKH 
YHLRNI^QLRCLVVDEADRMVEKGHPAELSQI^EWLNDSQYNPK 
RQTLVFSATLTLVHQAPARILHKKHTKKMDKTAKLDLLMQKIGM 
RGKPKVTDLTRNEATVETLTETKIHCETBEKDFYIjYYFLMQYPG 
RSLiVFANS I SC I KRLSGLIiKVLDIMPLTLHACMHQKQRLRNIiEQ 

FARLEDCVI*IATDVAARGLDIPKVQHVIHYQVPRTSEIYVHRSG
RTARATNRGLSLMLIGPEDVINFKKI YKTLKKDEDI PLFPVQTTK
YMDWKERI RLARQ IE KSE YRNFQACLHNSWIEQAAAALEI ELE
ED^KGGKADQQEERRRQKQMKVI.KKELRHLL^QPLFTESQKTK
YPTQSGKPPLLVSAPSKSESALS CLIS KQKICKKTKKPKEPQPEQP

QPSTSAN 


5528 


3 


895 


GPFLSACRMWGACKVKVHDSliATIS I TLRRYLRLGATMAKSKFE 
YVRJ3 PEADDTCIJ^CWVWRLDGRNFHRFAEKHNFAKPNDS RAL 
QIMTKCAQTVMEEIjEDIVIAYGQSDEYSFVFKRKTNWFKRRASK 
FMTHVAS Q FAS S YVF YVIRD Y FE DQ P L.L. Y P PG FDG R WVYP SNQT 
LKDYLS ^QADCHINNLYNTVFWALIQQSGLTPVQAQGRIjQGTL. 
AADKNEILFSEFNINYNNEPPMYRKGTVLIVJQKVDEVMTKEIKL 
PTEMEGKKMAVTRTRTKPCKPSHLPRAPCLRWL 


5529


46 


640 


TFRI*VSAHIiKTRKLINPEAAFJ?RWRDWDSRQGH1»SVKMQRVSGIj 

lswtlsrvlwlsgiusepgaarqprimeekalevydlirtirdpb 
kpotleelewsescvevqeineeeylviirftptvphcsliatl 
iglclrvklqrclpfkhicleiyisegthsteedrnkqindkerv 

aaamenpnlreiveqcvlepd 


5530 


4541 


2606 


AQIVHAISYCHKLHVGHRDIiIOPENVVFFEKQGLVKIiTDFGFSNK 
FQPGKKLTTSCGSIiAYSAPElLLGDEYDAPAVDIWSLGVILFML 
VGGQP P FQEANDSETLTM IMEX2K YTVPSHVS KECKDLI trmlqr 
DPKRRASLEEIENHPWI^GVDPSPATKYNIPI.VSYKNLSEEEHN 
S 1 1 QRMVLGD IADRDAI VE A r»ETNRYNH ITAT YFLLAER I LREK 
QEKEIQTRSAS PSNI KAQFRQSWPTKI D VPQDLEDDLTATPLSH 
ATVPQS PARAADS VLNGHRS KGLCDSAKKDDLPEIAGPALSTVp 
PASLKPTASGRKC1.fr VEEDEEEDEEDKKPMSLSTQWLRRKPS 
VTNRLTSRKSAPVLNQI FEEGESDDEFDMDENLPPKLSRLKMNI 
ASPGTVHKRYHRRKSQGRGSSCSSSETSDDDSESRRRLDKDSGF 
TYSWHRRDSSEGPPGSEGDGGGQSKPSNASGGVDKASPSENNAG 
GGSPSSGSGGNPTNTSGTTRRCAGPSNSMQLASRSAGELVESXjK 
LMSLCLGSQLHGSTKYI IDPQNGLS FSS VKVQEKSTWKMCI SST 
GNAGQVPAVGG I KFFSDHMADTTTELERI ksknlknnvlqlplc 

ektisvniqrkpkegllcasspascchvi 


5531 


24 


515 


gsopraprprdsmerpepeliroswravsrsplehgtvlfarlf 

ALEPDLLPLFQYNCRQFSS PBDCT.SS PE FLDHIRKVMLVIDAAV 
TOVEDLSSLEEYLASI^KHRAVGVKLSS FSTVGESLLYMLEKC 
IiGPAFTPATRAAWSQLYGAVYQAMSRGWDGE 


5532 


3395 


1402


SDWMWGKRKM I IEDETEFCGEELLHSVLQCKSVFDVLDGEEt4R 
RARTRAN P YEK I RG VTFLNRAAMKMANMD FVFDRM FTNPRD S YG 
KPLVKDREAEIJj YFAD VCAGPGGFSE YVLWR 
GPMDFKLEDFYSASSELFEPYYGEGG idgdgditrpenisafrn 
FVIJWTDRKGVHFLMAIXKJFSVEGQENLQEILSKQLLL^ 

i^i^rtgghficktfdlftpfsvglvyllyccfervclfkp ITS 

RPANSERYWCKGLKVGIDDVRDYXFAVN I KLNQ LRNTDSDVNL 
WPLEVIKGDHE FTD YM I RSNESHCSLQI KA1AKIHAFVQDTTI* 
SEPRQAEIRKECLRLWGI PDQARVAPSSSDPKS KFPELIQGTE1 
DIPSYKPTLLTSKTLEKIRPVFDYRCMVSGSEQKFLIGIjGKSQI 
YTWDGRQSDRWI KLDL KTELPRDTULS VE IVHELKGEGKAQRKI 
SAI H ILDVLVLNGTDVREQHFNQRI QLAE KFVKAVS KPSR PDMN 
PIRVKEVyRLBEMEKIFVRLEMKI I KGSSGTPKLSYTGRDDRHF 
VPMGLYI VRTVNE PWTMGPS KS FKKKFFYNKKTKDSTFDL PADS 
I AP FH I CYYGRli FW EVJG DG I RVHD S Q KPQDQD KL.S KED VT>S P I Q 
MHRA 


5533


94 


789 


MKERRAPQP WARC KLV1, VGD VQCG KTAMLQVIlAKDCY P ETYVP 
TVFENYTACLETEEQRVELS LWDTSGS PYYDNVRPLCYSDSDAV 
LLCFDISRPETVDSALKKWRTEILDYCPSTRVLLIGCKTDLRTD 
I^L^LSHQKQAPISYEQGCAIAKQIjGPEIYLEGSAFTSEKS I 
HS1FRTASMLCUJKPSPLPQKSPVRSLSKRLLHLPSRSELISPT 
FKKEKAKXGS I M 


5534 


3 


605 


LVRGRARAANPGRVGAMDGLRQRVEHFLEQRNLVTEVLGALEAK 
TGVEKRYLAAGAVTLI^ L YIjLFG YGASLIjCNL I G FVYPAYAS IK 
AIESPSKDDDTVWI>TYVT\ATyAIiFGI»AEFFSDI#I*I*SWFPFVYVGK 
CAFI^FCMAPRPWNGALMLYQRVVRPLFLRHHGAVDRIMNDLSG 
RALDAAAGITRNVKPSQTPQPKDK 


5535 


1029 


332 


KSFMDS EARLCS 1*VEL»S DTQDETQ KS DS ENEDLKI DCLQES QEL» 
NLQKLKNSERI LTEAKQ KMRE LTVN I KMKEDLI KELIKTGNDAK 
SVSKQYTLKVTKLEHDAEQAKVELTETQKQLQELENKDLSDVAM 
KVKLQ KE FRXKVDAAJKItRVQVTjQKKQ QDS KKLAS LSI QNE KRAN 
ELEQSVDHMKYQKIQIjQRKLQEENEKRKQLDAVI krdqqki kvi 
LSYI PAKYNMKC 


5536 


942 


282 


AAATAAS LS PRGCR LRT PS S DVS PSRA? P PS AAPLPTGRAQMS P 
SGR1»CI*1»TI VGIj 1 L PTRGQTI*KDTTS S S SADAT I MDIQVPTRAP 
DAVYTEI»QPTS PTPTWF ADETPQPQTQTQQIxBGTDGPliVTD PET 
HKSTKAAHPTDDTTTI*SER P S PSTDVQTDPQTIjKP sgfheddp f 
FYDEHTLRKRG I» L VAAVLF I TG 1 1 1 L TSG KCRQ I>S RLiCRNHCR 


5537 


3 


2391 


rarvsspqlrvfrsgrprrlrvlrinrtsvalriagtgrfvakt 
pghpgswemglltfrdvavefslbewehlepaqknlyqdvmlen 
yri^vslglwskpdlitfleqrkepwnvkseetvaiqpdvfsh 
YNKDUjTEHCTBAS fqkvi srrhgs cdlenlhlrkrwkreeceg 
hngcydektfic^dqfdessveslfiiqqilsscaksynfdqyrkv 
pthssli4nqqeeidiwgkhhiydktsvi1frqvsti1nsyrnvfig 
eknyhcnns ektiinqs ss pknhqeny flekqykcke fsevftiqs 
mhgqekqeqsykonccvevctqslkhi qhqtihirens ys ynky 
dkdlsqssnlrkq 1 1 hnebxp y kcekcgdslnhslhltqhq 1 1 p 

TEEKPYKWKECGKVF^TLiNCSIjYI»TKQO^IDTGEN^YKCKACS^ 
FTRSSNLIVHQRIHTGEKPYKCKECGKAFRCSSYliTKHKRIHTG 
EKPYKCKECGKAFNRSSCLTQHQTTHTGEKLYKCKVCSKSYARS 
SIOjIMHQRVHTGEKP YKCKECGKVFSRS S CLTQHRK IHTGENL Y 
KCKVCAKPFTCFSNIjIVHERIHTGEKPYKCKECGKAFPYSSHLI 
RHHRIHTGEKPYKCKACSKSFSDSSGLTVHRRTHTGEKPYTCKE 

o;kafsyssdviqhrrihtgqrpykceec»kafnyrsyl,tthqr 

SHTGERPYKCEECGKAFNSRSYIjTTHRRRHTGERPYXCDECGKA 
FS YRS YI/TTHRRSHSGER PYKCEECX5KAFNSRS YL IAHQRSHTR 
EKL 


5538 


926 


161 


H^MMMKIPWGSIPVIJ^LiIjLI^IjIDISQAQLSCrrGPPAIPGIPG 

I PGTPGPDGQPGTPGI kgekglpglagdhgefgekgdpgipgnp 
gkvgpkgpmgpkggpgapgapgpkgesgdykatqkiafsatrti 
*t\n>lrrdqtirfdhvitnmnnnyeprsgkftc 
ssrgnlcvnlmrgreraqkvvtfcdyaynt fqvttggmvlkleq 
genvflqatdknsllgmegans i fsgfllfpdmea 


5539


38 


1258 


HRGPSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPG 
IVDGPAALAS FPETVPAVPGPYGPHRPPQPLPPGljDSDGLKRBK 
DEIYGHPLFPLLALVFEKCEIATCSPRDGAGAGLGTPPGGDVCS 
SDSFNEDIAAFAKQVRS ERPL. FS SNPELDNLVIQA3 QVLRFHLL 
BLBKVHDLCDNFCHRY I TCLKGKMPIDLVI BDRDGGCREDFEDY 
PASCPSI^DQJ^MWIRDHEDSGSVHLGTPGPSSGGLASQSGDNS 
SDQGDGLDTS VAS PSSGGEDEDLDQER3RNKKRG I FPKVATNIM 
RAWLFQHLSHPYPS EEQKKQLAQDTG LTXLQVNNWFI NARRRI V 
QPM I DQSNRTGQG AAFS PEGQPIGG YTETQPHVAVRPPGS VGMS 
LNLEGEWKYL 


5540 


148 


1440 


PPI/3AGAGVHARSPHPARRLPLTTAGVGGRAPDLLPTPWRQHRG 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARR YDELPHYPG I VD 
GPAAIASFPETVTAVPGPYGPHRPPQPLPPGIiDSDGLKRBKDEI 
YGHPLFP LLALVFEKCELATCS PRDGAGAGLGTPPGGDVCS SDS 
FNEDNT AF AKQ VR S E R PIj F S SN P ELDNLMI Q Al Q VLiR FHLLELE 
KGKMP1DLVIEDRDGGCREDF2DYPASCPSLPDQNNIWIRDHED 
SGSVHLGT PGPS SGGLASQSGDNSS DQG VGIjDTS VAS PSSGGED . 
EDLDQEPRRNKKRG X FPKVATN I MRAWLFQHLSHPYPSEEQKKQ 
IAQDTGLTILQVNNWFINARRRIVQPMIDQSNRTGC^AAFSPEG 
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL 


5541 


143 


1440 


PPLGAGAG VHARS PHPARRLPLTTAGVGGRAPDLLPTPWRQHRG 
PSGAAAPGCALPRGQALEG PRS CRRPQ PMARRYDELPH YPG I VD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDEI 
YGHPLFPLLALVFEKCELATCS PRDGAGAGLGTPPGGDVCS SDS 
FNEDNTAFAKQVRS ERPLFS SNPELDNLMI QAI QVLRFHLLELE 
KG KMP I DI>V I EDRDGG CREDFED YPAS CPS LPDQNNI WIRDHED 
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED 
EDLDQEPRRNKKRGIFPKVATNIMRAWLFQHLSHPYPSEEQKKQ 
IAQDTGLTI WVNNWFINAKRRI VQPMIDQSNRTGQGAAFS PEG 
QP I GGYT ETE PHVAFRAP AS VGDE FGTRKE E WHYL 


5542 


148


1440 


PPLGAGAGVHARSPHPARRLPLTTAGVGGRAPDLLPTPWRQHRG 
PSGAAAPGCALPRGQAL.EGPRSCRRPQPMARRYDEL.PHYPG I VD 
GPAALAS FPETVPAVPGP YG PHRPPQPLP PGLDSDGLKREKDE I 
YGHPI^PLLALVFEKCEIATCSPRDGAGAGIjGTPPGGDVCSSDS 

FNEDNTAF AKQVRSER PLFSSNPELDNLM I QAI QVLRFHLLELE 
KGKM P I DLV I EDRDGG CREDFED Y PASCPS LPDQNNI W IRDHED 
SGSVrlLGTPGPSSGGLASQSGDNSSDQGVGLDTSVAS PSSGGED 
EDLDQEPRRNKKRGI FPKVATNIMRAWLFQHLSHPYPSEEQKKQ 
LAQDTGLTILQVNNWFINARRR I VQPMIDQSNRTGQGAAFS PEG 
QPIGG YTETEPH VAFRAPASVGDEFGTRKEEWH YTj 


5543 


2405 


665 


R WVREQPW PLRTS EAVKT PALRP FPGPRGVS P F PKPDVJGKS PAP 
KRPFS DSGAFWS PERRPG VZiEAPRRRPVPAS FRAVPPKPTRVHG 
SSASRDRVLARTMIVADSECRAELKDYLRFAPGGVGDSGPGEEQ 
KBS RARRG PRG PSAFI P VEEVLREGAESLEQHLGLEALMSSGRV 
DNIA>A7MGLHPDYFTSFWRLHYLLIjHTDGPLAS SWRHYIAIMAA 
ARHQCSYLVGSHMAEFLQTGGDPEWLLGl^RAPEKLRKLSE INK 
LIJ^PWLITKEHIQALLKTGEHTWSIAELIQALVIJjTHCKSLS 
SFVFGCGILPEGDADGS PAPQAPTPPSEQSS PPSRDPLNNSGGF 
F^ARDVEALMERMQQLQESLLRDEGTSQEEMESRFELEKSESLL 
VTPSAD ILEPS PHPDMLCFVEDPTFGYEDFTRRGAQAPPTFRAQ 
DYTWEDHG YSLI QRLYPEGGQLLDEKFQAAYSLT YNTIAMHSGV 
DTSVLRRAIWNY I HCVFG I RYDD YDYGEVNQLLERNLKVY I KTV 
ACYPEKTTRRMYNLFWRHFRHSEKVirvmjIiLLEARMQAALLYAL 

RAITRYMT 


5544 


1895 


514 


I^MLLGRQRI^LRrroAGRIXSAPMERHGRAS A J. £> VS i> /UaJsUAACsLi 
PEGRRQEPIjRRRASSASVPAVGASAEGTRRDRLGSYSGPTSVSR 
QRVESIoRraCRPLFPWFGLDIG^TLVKLVYraPKDITAEEEEEEV 
ESLKS IRKYLTSNVAYGS TG I RDVHLELKDLTLCGRKGNLHF I R 
FPTHDMPAFIQMGRDKNFSSLHTVFCATGGGAYKFEQDFliTIGD 
LQLCKLDELD CLI KGI LiYI DSVGFNGRS QCY YFENPADSEKCQK 
LPFDLKN P YPLLLVN IGSGVS ILAVYS KDN YKRVTGTSLGGGTF 
FGLCCI^TGCTTFEEALEMAS RGDS TKVDKLVR DI YGGDYBRFG 
LPGWAVAS S FGNMMS KEKREAVS KEDLARATL I T I TNNIGS 1AR 
MCAI^ENINQVVFVGNFIJ^NTIAMRIXAYA^ 
SEHEG YFGAVGALLELLKI P 


5545 


802 


131 


GAMWSAGRGGAAWPVLLGLIJUAIjIjVPGGGAAKTGAELVTCGSVL» 
KIJ^THHRVPaJiSnDlKYGSGSGO^SVTGVEASDnANSyWRIRG 
GSEGGCPRGS PVRCGQAVRLTK^TGKNLHTHHFPSPLSNNQEV 
SAFGEIXSEGDDlJDLWTVRCSGCHWERE^VRFQHVGTSVyi^VT 
GEQYGSPIRGQHEVHGMPSANTHNTWKAMEGIFIKPSVEPSAGH 
DEL 


5546


1592 


146 


FVPRGGHSS MGOSGP^RHOKRARAOAOLRNLEAYAANPHS FVFT 

RGCTGRNIRQLSLD^RVMEPLTASRLQVRKKWSLKDCVAVAGP 

LGVTHFL I LS KTETNVYFKLMRLPGG PTLTFQVKKYS LVRDWS 

SLPJlHRMHEC^FAHPPIOiVl^SFGPHGMHV^ 

NVHKVNLNT IKRCLLI D YNPDSQELDFR HYS I KWPVG AS RGMK 

KLLQEKFPNMS RLQD IS E LLATG AG LS ESEAEPDGDHN I TELPQ I 

AVAGRGNMPJ^QQSAVRLTEIGPR^LQLIKVQEGVGEGKVMFHS i 

FVS KTEEELQAI LEAK^KKLRLKAQRQAQQAQNVQRKQBQREAH 

RKKShEGMKKARVGGSDEEASGlPSRTASLELGEDDDEQEDDDI 

EYFCQAVGEAP S EDLFPEAKQKRLAKS PGRKRKRWEMDRGRGRL 

CDQKFPKTKDKSQGAQARRGPRGASRDGGRGRGRGRPGKRVA


5547




XH O 


RGCTGRJTXRQLSLDVRRVMEPLTASRLQVR KKNS LKDCVAVAGP 
LGVTHFL I I>SKTETNVYFKLMRLPGGPTLTFQVKKYSLVRDVVS 
SLRRHRMHEQQFAHP PLLVLNSFG PHGMHVKLMATMFQNLFPS I 
NWKVNLNTIKRCLLIDYNPDSQELDFRHYS 1 KWPVG AS RG MK 
KLLQEKFPNMS RLQD IS ELLATGAGLSESEAEPDGDHN I TELPQ 
AVAGRGNMRACK3SA VRLTE I GPRMTLQLIKVQEGVGEGKVMFfiS 
FVS KTEEELQAI LEAKE KKLRLKAQRQAQQAQNVQRKQEQREAH 
RKKS LEGMKKARVGGSDEEASG IPS RTA55 L ELG EDDDEQEDDDI 
EYFCQAVGEAPSEDLFPEAKQKRLAKS PGRKRKRWEMDRGRGRL 
CDQKFPKTKDKSQGAQARRG PRGASR DGGRGRGRGRPGKRVA 


5548 


1 


2153 


DQTG P PETI AFT FPRSTMEPLCPLLLVG FSLPLARALRGNETTA 
DSNETTTTSGPPDPGASQPLLAWLLLPT .T iLLLLVLLLAAY FFRF 
RKQR1CAVVSTSDKK2V1PNGILEEQEQQRVMLLSRSPSGPKKYFPI 
PVEHLEEEIRIRSADDCKQFREEFNSLPSGHIQGTFELANKEEN 
REKNRYPNILPT3DHSRVILSQLDGIPCSDYINASYIDGYKEKNK 
FIAAQGP KQETVND FWMVWEQKS AT I VMLTNLKERKE EKCHQY 

LVSQLHFTSWPDFGVPFTP IGMLKFLKKVKTLNPVHAGPI WHC 
S AGVGRTGTFI VIDAMMAMMHAEQKVD VFEFVSR I RNQRPQMVQ 
TOMQYTFIYQALLEYYXYGDTEIJDVSSI^KHLQTOHGTTTHFDK 
IGLEEEFRKLTNVR IMKENNRTGNLPANMKKARVIQI I PYDFNR 
VILSMKRGQEYTDYINASFIDGYRQKDYFXATQGPLAHTVEDFW 
RMIWEWKSHTIVMLTEVQEREQDKCYQYWPTEGSVTHGEITIEI 
KNDTLSEAJSIRDFLVTLNQPQARQEEQVRWRQFHFHGWPE IG 
IPAEGKGMIDLIAAVQKQQQQTGNHPITVHCSAGAGRTGTFIAL 
SNILBRVKAEGLLDVFQAVICSLRLQRPHMVQTLEQYEFC^KVVQ 
DFIDI FSDYANFK 


5549 


915 


256 


FEATGGKRLAFK>1AGTARHDREMAIQAKKKLTTATDPIERLRLQ 
CLARGS AG I KGLGRVFRIMDDDNNRTLDFKEFMKGLNDYAWME 
KEEV^ELFQRFDKDGNGTIDFNEFXJ.TLRPPMSRARKEVIMQAF 
RKLDKTGDGVI T I EDLREVYNAKHHPKYQNGE WS E EQVFRKFLD 
NFDS P YD KDGLVT PEE FMNYYAGVSAS IDTDVYFI IMMRTAWKL 


5550 


2364 


1210 


RKRKVFLKMRRLNRKKTLSLVKELDAFPKVPESYVETSASGGTV 
SLIAFTTMALLTIMEFSVYQDT^MKYEYEVDKDFSSKLRINIDI 
TVAKKCQYVGADVLDLAETMVASArXSLVYEPTVFDI^PC^ 
RMLQLIQSRLQEEHSLQDVIFKSAFKSTSTALPPREDDSSQSPN 
ACR IHGHL YVNKVAGNFHI TVX3KAIPHPRGHAHLAALVNHES YN 
FSHRIDHLSFGBLVPAI INPLDGTEKI AI DHNQMFQY FIT WPT 
KLHTYKIS ADTHQFS VTERERI INHAAGSHGVSGI FMKYDLSSL 
MVTVTEEHMPFWQFFVRLCGIVGGIFSTTGMLHGIGKFIVEI IC 
CRFRLGSYKPVNSVPFEIX3HTDNHLPLLENNTH 


5551 


211 


1700 


MQPXHTMDYKESCPSVSIPSSDEHREKKKRFTVYKVLVSVGRSE 
W FVFRR YAEFDKLY NTLKKQF PAMAJLK I PAKRIFGDNFDPDFI K 
QRRAGLNE FIQNL^/RYPELYNHPDVRAFLQMDSPKHOSDPSEDE 
DERSSQKLHSTSQNINIX5PSGNPHAKPTBFDFLKVIGKGSFGKV 
LLAKRKilXSKFYAVTCVI^QKKJLVLNRKEQXH^ 
PFLVGUiYSFQTTEKX.YFVLDFVNGGEIiFFHLQRERSFPEHRAR 
FYAAE I ASALGYLHS I KIVYRDLKPEIJILLDSVGHWLTDFGLC 
K^GIAISDTTTTFCGTPEYLAPEVIRKQPYDNTVDWWCLGAVLY 
EMLYGLP P FYCRD VAEMYDN I LHKPLSLR PGVSLTA WS ILEELL 
EKDRQNRLGAKEDFI.E IQNHPFFESLSWADLVQKKI PPPFNPNV 
AGPDDIRNFDTAFTEETVPYSVCVSSDYS XVNASVXiEADDAFVG 
FSYAPPSEDLFL 


5552 


2748 


930 


IX3PAAGAAMGKKH K KHKAE WRS5 YED YADKPI*E KPIJCLVLK VGG 
SEVTELSGSGHDSSYYDDRSDHERERHKEKKKKKKKKSEKEKHL 
DDEERRKRKEEKKRKREREHCDTEGEADDFDPGKKVEVEPPPDR 

IAPGYSM 1 1 KH PMDFGTMKDKI VANEY KSVTEFKADFKLMCDNA 
MTYNRPDTVYYKLAKKI LiHAG FKMMSKQAALIjGNEDTAVEEP VP 
EWP VQVETAKKS KKPSREVI S CM FEPEGNACSIjTDSTAEEHVL 
ALVEHAADEARDR INRFLPGGKMGYIjKRNGDGSLIiYSWNTAEP 
dadeeethpvdlssls s KLLPG FTTLGFKDERRNKVTFLSSATT 
ALSMQ1WSVFGDLKSDEMEIjI»YSAYGDETGVQCAI*SIiQEFVKDA 
GS YSKKWDDIiIjDQI TGGDHSRTLFQIiKQRRNVPMKPPDEAKVG 

HLWLDFJTTKL.LQDLHEAQAERGGSRPSSNLSSLSNASERDQHHL 
GSPSRLSVGEQPDVTHDPYEFLQSPEPAASAXT 


5553


74 


1095 


LGREAVYLVS RMDGPVAEHAKQE PFHWTPliLESWAiSQVAGMP 
VFIjKCENVQPSGSFKI RG IGHFCQEMAKKGCRHIjVCSSGGNAGI 
AAAYAARKLG I PAT I VL PES TS IiQ WQRIjQGEGAE VQLTGKVWD 
EANLRAQEIiAKRDGWENVPPFDHPLIWKGHASLVQELKAVLRTP 
PGALVLAVGGGGULAG WAGLLEVGWQHVPI IAMETHGAHCFNA 
A I TAGKLVTLPDX TSVAKS LGAKTVAARALECMQVCKI HS EWE 
DTEAVS AVQQliI*DDERMIiVEPACGAALAAI YSGLLRRLQAEGCL 
PPSLTSVWlVCGGNNINSREUJAliKTHLGQV 


5554 


166 


2318 


CSGRTGGRGS LRPAENVCI/TCKLSGAETRGIiLC PAIiRTWIMK VI* 
GRS FFWVLF PVLPWAVQAVEHEE VAQR VI KLHRG RGVAAMQS RQ 
WVRDSCRKLSGLLRQKNAVIJSrKLKTAIGAVEKDVGLSDEEKLFQ 
VHTFE I FQKEI^ESENSVFQAWGLQRALQGDYKDWNMKESSR 
QRLEALREAAI KEETEYMEIJLiAAEKHQVEAIjKNMQHQNQSLSML 
DE IIiED VRKAADRLEEB I EEHAFDDNKSVKG VN FEAVLRVEE EE 
ANSKONITKREVEDDLGLSMIiIDSQNNQYILTKPRDSTIPRADK 
HFlKDIVTIGMLSLPOSWLCTAIGLPTMFGYIICGVIJiGPSGLN 
SIKsivQVETLGEFGVFFTLFLVGLEFSPEKLRKVWKISLQGPC 
YMTLI^IAFGLLWGI-flLJLRIK^TQSVFISTCI^LSSTPLVSRFLM 
GSARGDXEGD I DYS TVLLGMLVTQDVQIjG LFKAVMPTLIQAGAS 
ASSS I WEVLR I IWLI GQ I LFSLAAVFLLCLV I KKYLIG PYYRK 
LHMES KGNKE I LI LGISAFI FLMLTVTELLDVSMEIiGCFLAGAL 
VSSQG PWTEE I ATS I EPIRDFLAI VFFAS IGI*HVFPTFVAYEL» 
TVLVFLTLS VVVMKFLLAAL VLS L I L PRS SQ Y I KW I VS AG LAQ V 
SSF^FVLGSRARRAGVISRFTVTLLILSVTTIiSLLLAPVLWRA^ 
TR CVPRPERRSSI/ 


5555 


212 


1425 


LSLRTRETPAPPRCEAASQGRVGWRADAAAEEAVRSVWNRTRDR 
GTMAPOWLSTFCLLLLYLIGAVIAGRDFYKILGVPRSASIKDIK 
KAYRKIAliQLHPDRNP DD PQAQE KFQDLGAAYEVI»SDSEKRKQY 
DTYGEEGLKDGHQSSHGDI FSHFFGDFGFMFGGTPRGX3DRNI PR 
GSD I r^TDLEVTIvEEVYAiSNFVEVVRl^KPVARQ APG KRKCNCRQE 
MRTTX3LGPGRFQMTQEVGCDECPWVKLVNEERTLEVEIEPGVRD 
GMEYPF IGEGEPHVDGEPGDLRFRIKWKHP I FERRGDDLYTNV 
TISI* VESL VGFEMD I THLDGHKVHISRDKI TR PGAKL WiCKGEGL 
PNFDNNNI KGSLI I TFDVDFP KEQL.TEEAREG I KQLLKQGS VQK 
VYNGIjOGY 


5556 


5835 


3346 


RTRGMSKNCVP^FEEYLLRMFOGTFYLI^KITKDI^AHTVKSR 
LEEI^ES YI EKFTD FLRIiFVS VHLRRXES YSQ FP WE FJLTLLFK 
YTFHQPTHEG YFS CI*DI WTLFLDYLTSKI KS RLGDKEAVLNRY E 
DALVIiLTEVLNRIQFRY^AQLEEIJDDET 

QSLEWAKVr^IiPTHAFSTLFPVI^DNLEVYLGLQQFXVTSGS 
GHPJ^ITAEJn>OUlIjBCSLRlXLS^ 

FNDAJLTWERLVKVTL YGSQ I KLYN I ETAVP S VTJCPDL I DVHAQ 
SIAALQAY SHWLAQYCSEVHRQNTQQFVTIil STTMDAI TPLIST 
KVQDIOJJiSACHI^VSlATTVRPVFLISIPAVQKVFNRITDASA 
I^VDKAQVLVOyU^SNlI^LPWPNLPF^EQOWPVRSIimASLI 
SALSRDYRNLKPSAVAPQRKMPLDDTiCLI IHQTLSVLEDIVENI 
SGESTKSRQI CYQSLQES VQVSIiALFPAFIHQSDVTDEMLS FFL 
TLFRGLRVQMGVPFTEQI IQTFLNMFTREQLAESILHEGSTGCR 
WEKFLKIIjQVWQEPGQVFKPFLPSI IAIiCMEQVYPI I AERPS 
PDVKAELFELLFRTLHHNV7RYFFKSWJUASVQRGIAEEQMENEP 
QFSAIMQAFGQSFIiQPDIHLFKQNLFYLETLNTKQKLYHKKIFR 
TAMLFQFVNVLLQVDVHKSHDLLQEE IG IAIYNMASVDFDGFFA 
AFLPEFLTS CT)GVDANQKSVIX;RNFKMDRVRRERGRAKRRAEWA 
RKPGTCAARRGRIEASGRGLCPPCSLAAAHEMPADLVL 


5557 


1712 


491 


VILGAGLRDKDMWI PWGLPRRIiRbSALAGAGRFCILGSEAATR 
KHLPARNH CGLSDSS PQLWPEPDFRNPPRKASKASLDFKRYVTD 
RRLAETLAQI YLGKPSRP PHIjUiECNPGPGI LTQALLEAGAKW 
ALESDKTFI PHLESIX5KNLDGKLRVIHCDFFKLDPRSGGVIKPP 
AMSSRGLFKWLGIEAVPWTADI PLKVVGMFPSRGEKRALWKLAY 
DLYSCTSIYKFGRIEVNMFIGEKEFQKLMADPGNPDLYHVIjSVI 
WQLACEIKVLHMEPWSS FDI YTRKGPI^PKRRELOJDQLQQKLY 
LIQMIPRQNIiFTKWLTPMNYNIFFHLLKHCFGRRSATVIDHLRS 
LTPIJ)ARDILMQIGKQED3KVVNMHPQDFKTLFETIERSKDCAY 
KWLYDETLEDR 


5558 


1509 


96 


RAGCTHPOVPADLGiAPAE D RRPOKTCVCLLQPQPGGORGPrTMI 
TGWSMRLWTPVGVLTSIAYCLHQPJiVAIiAELQE ADGQ CP VDRS 
LLKLKMVQWFRHGARS PLKPLPLEEQVEWNPQLLEVPPQTQFD 
YTVTNLJU3GPKPYSPYDSQYHETTLKGGMFAGQLrKVGMQQMFA 
LGERLRKNYVED I PFI*S PTFNPQ E VF I RSTN I FRNLESTRCIiLA 
GLFQCQKEGPI I IHTDEADSEVIiYPNYQSCWSLRQRTRGRRQTA 
SI^PGISEDLKKVKDRMGIDSSDKVDFFITiLDIJVAAEOAHNLPS 
CPMLKRFARMIEQRAVDTSLYlliPKEDRESLQMAVGPFLHILES 
NLI»KAMDSATAPDKI RKLYXYAAHDVTFI PLLMTLGIFDHKWPP 
FAVDLTMELYQHLESKEWFVQLYYHGKEQVPRGCPDGLCPLDMF 
I^AMSVYTI^PEKY1IAI*C5QTX2VMEVGN^ 


5559 


150 


1983 


PLAATAHFAKMSRVAKYRRQVSEDPDIDSLLETLSPEEMEELEK 
EiDVVl>PIX5SVPVG^QRNQTEKQSTG\miREAMLNFCEKETKK 
LMQREMSMDE S KQVETKTDAKNGE ERGRDAS KKALGPRRDSDLiG 
KEP KRGGL KKS FS RDRDEAGGKSG EKP KEE KI I RG I DKGRVRAA 
VDK KEAGKDGRGEERAVATKKEEE KKG S DRNTGLSRDKDKKREE 
MKEVAKKEDDEKVKGERRNTT)TRKEGE KMKRAGGNTDMKKEDEK 
VKRGTGNTDTKKDDEKVKKNE^LHEKEIAKDDS KTKTPEKQTPSG 
PTKPSEGPAKVEEEAAPS IFX>EPI»ERVKNNDPEMT3VNVNNSDC 
ITNE I LVRFTE AIiEF^TVVKLFAIiANTRADDHVAFAI AIMLKAN 
KTITSLNLDSNHI TGKG I IiAIFRALLQNNTLTELRFHNQRHI CG 
GKTEP1E lAKIXKENTTIiIiKLG YHFELAG PRJ4TVTWLIiSRNMDKQ 
RQKRLQEQRQAQ EAKGE KiCDLLE V P KAG AVAKG S PKPSPQ PS PK 
PSPKNS PKKGGAPAAPPPPPPPLAPPLI MENLKNSLSPATQRKM 
GXJKVLPAQEKNSRDQIJ^AAIRSSNLKQLKKVEVPKLLQ 


5560 


9 


921 


SSVVEFSAl^VSMACljSPSQLrQKFQQDGFliVL.EGPL.SAEECVAM 
QQRIGEIVAEMDVPLHCRTEFSTQEEEQDRAQGS TDYFLSSGDK 
IRFPFEKGVFDEKGNFIiVPPEKS INKIGHAIiHAHDPVFKS ITHS 
FKVQTLARSIiGLQMPVVVQSM YIFKQPHFGGEVS PHQDAS FLYT 
EPLGRVLGVWIAVEDATLENGCLWFI PGSHTSGVSRRMVRAPVG 
S APGTS FLGSE P ARDNSL FVPTPVQRGALVLIHGEVVHKS KQNL 
SDRSRQAYTFHLMEASGTTWS PENWLQPTAELPFPQLYT 


5561


2175 


1775 


CYFI FQFFSSPYPGIiHPHQTPAPIiPNPGLYPPPVStIS PGQPPPQ 
Vf It t Fr* 1 X V o>Vrv»Vl*iJN C lafMxro X ir jL/\x wJT-i-l Xr XT JT C C rriul "IN X yruro 

QVYGGVTYYNPAQQQVQPKPSPPRRTPQPVTIKPPPPEVVSRGS 
S 


5562 


342 


1385 


SSG KNDMAAAGAAGL VRGIiKAGVI^QAD YIi^ 
LQSTDYGN EliANEAS PLWS VIJ5DRLKEKMVVE FRHMRNHAYEP 
LAS FliD F IT YS YMl DNV ILLITGTLHQRS I AELVPKCHPLGS FE 
CMEAVNIAQTPAELYNAILVDTPJjAAFFQDCIS eqdldemniei 
IRNTLYKAYLES FYKFCTLLGGTTADAMCPI LE FEADRRAFI IT 
IKS FG TEX»S KEDRAKL F PHCGK L Y P EG LAU liAKAiJU x c»Q V KN V A 
DYYPEYK3^FEGAGSNPGDKTLEDRFFE^iEVKljNKLAFI^0FHF 
GVFYAFVKLKEQECRNI vwi aec iaqrhrakidnyi PI f 


5563 


342 


1385 


SSGKNDMAAAGAAGLVRGLKAG VLSQA\D YI^L VQCETIiEDL KIM 
I/QSTDYGNFIJVNEASPLTVSVTDDPJjKEKMVVEFRHMRNHAYEP 
LASFLDFITYS YMIDNVILLITGTLHQRSIABLVP KQIPLGS FE 
QMEAVN I AQT PAELYN AI LVDTPLAAF FQDC I S EQDLDEMN I EI 
IRNTLYKAYLES FYKFCTLLGGT 1 ADAMCP I LE F EADRRAF I IT 
INSFGTELSKEDRAKLFPHCXSRLYPEGLAQLARADDYEQVXNVA 
DYYPEYKLLFEGAGSNPGDKTLEDRFFEHEVKLIJKLAFLNQFHF 
GVFYAFVKLKEQECRNI VWI ABC IAQRHRAKIDNY I P I F 


5564


3 


914 


RVRRD KJ^WTARGRRRCGDSMSGGWMAQV^WRTGALGLALLL 
UiGl^LGLEAAASPLSTPTSAQAAGPSSGSCPPTKFQCRTSGLC 
VTLTWRCDRDLDCSDGSDEEECRIEPCTQKGQCPPPPGLPCPCT 
GVSDCSGGTDKKLRNCSRLACLAGELRCTLSDDCIPLTWRCDGH 
PDCPDSSDELGCXSTNEILPEGDATTMGPPVTLESVTSLRNATTM 
GPPVTLES VPS VGNATSSSAGDQSGS PTAYG VI AAAAVLSAS LV 
TATLLLLS WLRAQERLR PLGLLVAMKESLLLSEQKTSLP 


5565 


993 


138 


RWNS PNPARAGS I S R PQRAPG S VS AVAMTAAV F FG CAFIAFG PA 
LALYV FT I ATE PLRI I PL 1 AGAFFWLVSLLI SSLVWFMAR VI I D 
NKIX3PTQKYLLIFGAFVSVYIQ[^FRFAYYKLLKKASEGLKS IN 
PGETAPSMRLLAYVSGLGFG I MSGV FS FVNTLS DS LG PGTVG IH 
GDSPQFFLYSAFMTLVIIIJjHVFWGIVFFLXSCEKKXWGILLIVL 
LTHLLVS AQTF I S S YYG I NLAS AFI I LVLMGTWAFLAAGGSCRS 
LKLCLLCQDKNFLLYNQRSR 


5566 


2043 


1232 


SHIQHHGRGAQAPVKMVSWMISRAWLVFGMLYPAYYSYKAVKT 

KNVKEYVRWMMYWIVFALYTVIEWADQTVAWFP 

I WT iT »fl P YTKGASL I YRKFLH PLLSS KERE I DDY I VQAKERG YE T 

MVNFGRQGLNLAATAAVTAAVKSC^^ITERLRSFSMHDLTTIQG 

DEPVGQRPYQPLPEAKKKSKPAPSESAGYGIPLKDGDEKTDEEA 

EGPYSDNEMLTHKGPRRSQSMKSVXTTKGRKFA7RYGSLKYKVKK 


5567 


1554 


233 


EFLGSGVS PDIANEIX?LTALHQCCIDDFREMVQOLLEAGANINA 

CDSECWTPLHAAATCGHLHLVEIJLIASGANLIA^ 

CDDEQTLD CLETAMADRG I TQDS I EAARAVPELRMLDDIRSRIiQ 

AGAD3JHaPLDHGATO»IiHVaAANGFSEAaALI^ 

GWEPIjHAAAYWGQVPLVEIiVAHGADIiNAKSLMDETPLDVCGDE 

EVRAKLLELKHKHDALLRAQSRQRSIARP.RTSSAGSRGKVVRRV 

SLTQRTDL YRKQHAQEAI VWQQP P PTS PEPPEDNDDRQTGAELR 

PP PPEEDNPEWRPHNGRVGGS PVRHLYS KRLDRSVS YQLSPLD 

CTTOHTT.VHnif RHirrT.anT.TrROWA PP PERPESPETAEP 

GI> PGDTVT PQ P D CG FRAGGD P PLLKLTAP AVEAP VE RRP CCLLM 


5568 


1731 


587 


AEDRQPAS RRGAGTTAAMAASGPGCRSWCLCPEVPSATFFTALL 
SLLVSGPRLFLLQQPLaPSGLTLKSEALRNWQVYRI>VTYI fvye 
NP ISLLCGAI II WRFaGNFERTVGTVRHCFFTVI FAI FSAIIFL 
S FEAVSSL S XIX;EVEDARGFTPVAFA^tI>GVTTVR^P^ 
MWPSVLVPWLIxtiGASWLIPG/rSFLSNVCGLS IGLAYGLTYCYS 
IDI^SERVALKLDCTFPFSLMRRISVFTCx^ 

NPVPGSYPTQSCHPHLS PSHPVSQTQHASGQKLASWPS CT PGHM 
PTLP P YQ PASGLC YVQNHFGPNPTS S S VY PASAGTSLG I QPPTP 
VNSPGTVYSGALGTPGAAGSKESSRVPMP 


5569


2 


835


QTPCPLAWERGSRSEDISVPGQKPPTCSSFSGMDVGPSSLPHl^
LKT iTtTiT iU .LLPLRGQANTGCYGlPGMPGLPGAPGKDGYPGIiPGP 
KGBPG I PAI PG I RGPKGQ KG E PGL PGHPGKNGPMG P PGWPGVPG 
PMG IPGEPGBEGRYKQKFQSVFTVTRQTHQPPAPNSLIRFKAVL 
TWPQGDYDTSTGKFTCKVPGIiYYFVYBJ^HT^ 
WTFCX3HTSKTNQVMSGG VI^RLQVGEEVWIAVN^ 
SDSVFSGFLLFPD 


5570


264 


946 


RDRRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRERGGGDT 
MSSPSPGKRRMDTX>WKLIESKHEVTII/K3I^FWKFYGPQGT 
PYEGGVWKVKVDLPDKYPFKSPSIGFMNKIFHPNIDEASGTV'CIj 
DVINOTl^ALYDIiTNXFESFLPQLLAYPNPIDPLNGDAAAMYLH 

RPEBYKQKIKBYIQKYATEEALKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5571


264 


946 


R0RRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRF.RGGGDT 
MSS PS PGKRRMDTDWKI.I ESKHEVTILGGI*NE FWKFYGPOGT 
P YEGGVWKVRVDLPDKYP FKS PS IGFMNKI FH PNIDEASGTVCL 
D V XNQTWTAIf YD1»TNI FES FLPQLLA YPNP I DPI*NGDAAAMYLH 
RPEBYKQKIKEYIQKYATEEALKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5572 


2802 


2085 


RTDYRTGI PGRRFRVMAAGDGDVKLGTLGSGSESSNDGGSES PG 
DAGAAAEGGG WAAAALALLTGGG EMLLNVALVALVLLGAYRLVTV 
RWGRRGuGAGAGAGEESPATSLPRMKKRDFShEQJJIQYDGSRIW 
RI LIiAVNGKVFDVT KGS KF YG PAG P YG I FAG RDAS RGLAT F CLD 
KDALRDE YDDI*SDI>NAVOME S VREWEMQFKEKYD YVGRLLKPGE 
EPSEYTDEEDTKDHNKQD 


5573


2562


219


VPART PNAEDQGP EARAATAT PCQSGGRERAGEAAEDGVKMAAF 
SEMGVMPEIAQAVEEMDW1J-PTB1QAESIPLILGGGDVLMAAET 
GSGKTGAFSIPVIQIVYETIjKIX^EGKKGKTTIKTGASVI.NKWQ 
MNPYDRGSAFAI GSDGLCCQSREVKEWHGCRATKGIiMKGKHYYE 
VSCHDQGLCRVGWSTMQASLDLGTDKFGFGFGGTGKKSHNTCQFD 
N YGEE FTMHDT I GCYUD I DKGHVKF S KNGKDLG1»AFE I P PHMKN 
OALFPACVIJCNAELKFNFGEEEFKFPPKI^GFVALSKAPIXJYrVK 
SDHSGNAOVTOTKFLPhLAPKALIVEPSPELtAEOTIoNNIKOFKJCY 
IDNPKLRELLI IGGVAARDQLSVLENGVDIVVGTPGRLDDLVST 
GKLNLSQVRFLVLDETUDGLLSQGYSDFrNRMHNQIPQVTSDGKR 
LQVI VCSATLHSFDVKKLS EKIMHFPTWVDLKGEDSVPDTVHHV 
WPVNPXTDRLWERLGKSHIRTDDVHAKDNTRPGAKrSPEMWSEA 
I KILKGE YAVRAI KEHKMDQAI I FCRTKIDCDNLEQYFI GXJGGG 
P DKKGKQ F S C VCLHGDRKP H ERKQNLER FKKGD VRFLI CTDVAA 
RGID1HGVPYVINVTLPDEKQNYVHRIGRVGRAERMGLAI SLVA 
TEKEKVWYHVCSSRGKGCYNTPJIJCE^ 

HLNCTISQVEPDIKVPVDEFTCKVTYGQKRAAGGGSYKGHVDIL 
APTVQEUVALEKE AQTS FUfl/3 YIiPNQLFRTF 


5574 


1731 


952 


NEGLEVFKEQELQPSDKGAVPEDASTERSAMASLGLQLVGYILG 
LLGH/3T LVAMLLP S WKTS S YVG AS IVTAVG FS KGL WME CATH S 
TGITQCDIYSTLI^LPADIQAAQAWMVTSSAISSIACirSVVGM 
RCTVFCQESRAKDRVAVAGGVFFILGGLU5FIPVAWNLHG ILRD 
F YS PLVPDSMKFE IGEAL YIX3 1 1 SSLFSL I AGI ILCFSCS CQRN 
RSNYYDAYQAQPLATRSS PRPGQPPKVKSEFNS YSLTGYV 


5575 


456 


766 


LLWAIiPCP PPTAAAVLiiS STGLMELLE KMLALTUUCADSPRTAL 
LCSAWIJjTASFSAQQHKGSIiQKDPLLSQACVGCIiEAI^iDYIJDAR 
S PDIGRNS PHYLMFP 


5576 


249 


2146 


RSWGAPWFWRMRIJJUeRHMPLRLAMVGCAFVLFLFLbHRDVSSR 
EEATEKP WLKSL VSRKDHVI^LMLEAMNNLRDSMPKLQIRAP EA 
QQTLFS INQSCL PG FYT PAEL KPFWER P PQD PNAPG ADG KAFQK 
S KWTPLETQEKEEGYKKHCFNAFASDRI SLQRSLGPDTRPPECV 
DQKFRRCPPLATTSVI IVFHNEAWSTLLRTVYS VIiHTTPAILLK 
EI ILVDDASTEEHIiKEKLEQYVKQIiQVVRVVRQEERKGLITARL 
LGASVAQAEVLT FLDAHCECFHGWLEP LLAR I ABDKTVVVSPDI 
VTIDLNTFEFAKPVQRGRVHSRGNFDWSLTFGWETX.PPHEKQRR 
KDETYPI KSPTFAGGLFS I S KSYFEHI GTYDNQME I WGGENVEM 
SFRVWQCGGQLE1 I PCS WGHVFRTKSPHTFPKGTSVIARNQVR 
LAEVWMDS YKKI FYRRMLQAAKMAQEKS FGDISERLQLREQLHC 
HNFSWYIJINVYPEllFVPDLTPTFYGAIKNIXSTNQCI^ 
GKPLIMY S CHGIXSGNQYFE^TTQRDIjRHN I AKQLCIjHVSKGAIjG 
IX3SCHFTGKNSQVPKDEEWEI*AQDQLIRNSGSCTCI>TSQDKKPA 


5577 


3 


1275 


RNSDCSCGSISVHCLPWVLFILDLK\raSSMFCPLKL I LLP VLLD 
YSI^I^DLKTVSPPELTVHVGDSAI^IGCVFQSTEDKCI FKIDWTL 
S PGEHAKDE YVLYYYSNLS VP I GRFQNR VHLMGD I LCNDG SLLL 
QDVQEADQGTYI CEIRLKGESQ VFKKA WLHVL PEEP KELMVHV 
GGLIO^GCVFQSTEVKHVTKVEWIFSGRRAICEEIVFRYYHKLRM 
e tn? vcn e tii^ti c*nrjp vktt A/rsn T "FT? KTX^fJ T MIjOG VRSS DGGNYTCS 
IHLGNLVFKKTIVLHVSPEEPRTLVTPAALRPLVLGGNQLVI IV 

G IVCAT I LLLP VL 1 LI VKKTCGNKS S VNSTVLVKNT KKTNPE I K 
EKPCHPERCEGEKHIYSPIIVREVIEEEEPSEKSEATYMTMKPV 

WPSLRSDRNNSLEKKSGGGMPKTQQAF 


5578 


3 


783 


AVESMASPGAGRAPPELPERNCGYREVEYWDQRYQGAADSAPYD 
WFGDFSSFRALLEPELRPETiRILVLGCGNSALSYELFLGGFPNV 
TSVDYSSVVVAAMQARYAHVPQLRWE'TMDVRKLDFPSAS FDWL 
EKGTIJDALLAGERJDPWTVSSEGVHTVDQVLSEVSRVLV PGGRFI 
SMTSAAPHFRTRHYAQAYYGWSLRHATYGSGFHFHLYLMHKGGK 
LS V AQ t .ALGAQ ILSPPRPPTSPCFLQDS DHED FLS A I QL 


5579 


3 


1540 


RNSGLARGAS ALARHGGGLAGGVGWDCGACAS RCQG VMEGLLTR 
PP ATtPAT ATr^SRQTjSQYVPCRFHHCAiPKKCaKK )jjj.ltoKV J? U^U^A* 
REDR VLS LQDKS DDI/T C KS QRLMLQ VGL I Y PAS PG CYHLLP YTV 
RAME KLVRVI DQEMQA1GGQKVNMPSLS PAELWQATNRWDLMG K 
ELLRLRDRHGKE YCLG PTHEEAJC TALIASQKKLS YKQLP FLLYQ 
VTRKFRDEPRPRFGLLRGREFYMKDMYTFDSSPEAAQQTYSLVC 
D AYCSLFNKLGIiP FVKVQAD VGT IGGTVSHE FQLPVD I GEDRLA 
I CPRCSFSANMETLDLSQMNCPACQGPLTKTKGIEVGHTFYLGT 
KYSS IFNAQFTNVCGKPTIAEKGCYGIjGVTRZIAAAIEVLSTED 
CVRWPSLLAPYQACLI PPX2CGSKEQAASEL I GQL YDH I TEAVFQ 
LIIGEVLLDDRTHLTIGNRLKDANKFGYPFVI IAGKRALEDPAHF 


5580 


1681 


450


ADAGTRCIPGFVVPSGAGYSAPAQRGRRSSGRMRAAAAPGLTAP
WRLLQCCELEAGELGMAVPAAAMGPSALGQSGPGSMAPWCSVSS 

G PSRYVLGMQEL FRGHS KTREFLAHSAKVHSVAWS CEGPJRIiASG 
SFDKTASVFLLEKDRLVKENNYRGHGDSVLQLCWHPSNPDLFVT 

ASGDKTIRI WD VRTTKC IATVNTKGBN INI CW S PDGQTIAVGNK 
DDVVTFIDAKTHRSKAEEQFKFEVNE I SWNNDNNMFFLTNGNGC 
INILS YPELKPVOS INAHPSNCXC I KFDPMGKYFATGSADALVS 
LWDVDELVCVR CFSRLDWPVRTLS FSHDGKMLASASEDH F I DI A 
EVETGDKLWE VQCESPT FTVAWHPKRPLLAFACDD KDGKVBS S R 
EAGTVKLPGLPHDS 


5581 


54 


947


GGGSGPRAPSATLLDTGESVAAVASGEDKGIAASAAAAAVFACS
CSPDPQSSTMNPVYSPVQPGAPYGNPron^YTGYPTAYPAAAPA 
YNPSLYPTNSPSYAPEFQFLHSAYATLLMKQAWPQNSSSCGT2G 
TFHLP VDTGTENRT YQAS S AAFRYTAGTP YKVP P TQSNTAP P P Y 
SPSPUP YQTAM YP I RS A YPQQNL YAQGAY YTQP VYAAQPH VIHH 
TTWQPNS IPS AI YPAPVAAPRTNGVAMGMVAGTTMAMS AGTLL 
TTPQHTAIGAHPVSMPTYRAQGTPAYSYVPPHW 


5582 


5775 


2739


I ITNNNNVI IPLVIAYHLSGSAQARGERS PAERLMERQKRKADI
EKGLQFI QSTLPLKQEEYEAFIJjKLVQNLFAEGNDL FREKDYKQ 
ALVQ YMEGLNVADYAASDQ VAL PRELLCKLHVNRAAC Y FTMGL Y 
EKALEDSEKAJU3LDSBSIRALFRKARALNELGRHKEAYECSSRC 
S IjALPHDESVTQLGQELAQKLGLRVRKAYKRPQELETFSLLSNG 
TAAGVADQGTSNGLGS IDD I ETDCYVD PRGSPALLPSTPTMPLF 
PH\^DLlAPLDSSRTLPSTDSLDDFSIX3DVFGPEl^TLXiDSLSL 
VQGGLSGSGVPSELPQLIPVFPGGTPLLPPWGGS I PVSSPLPP 
ASFGLVMDPSKKLAASVLDALDPPGPTLDPLDLLPYSETRLDAL 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








DSFGSTRGSLDKPDSFMSETOSQDHRPPSGAQKPAPSPEPCMPN 
TAUjI KNPIiAATHE FTGQACQLCYPKTGPRAGDYTYREGLEHKCK 
IU5II»I/3RI^SEDQTWKRIRPRPTKTSFVGSY^CKI)MINKQIX: 
KYCTNCTFAyHQEEIDVWTEERKGTLNR15IjLFDPIiGGVKRGSI»T 
IAKLLKBHQGI FTFLCEI CFDSKPRI IS KGTKDSPS VCSNLAAX 
HSPY^KCLVHIVRSTSIiKYSKIRQFQEHFOFDVCRHEVRYGCL 
REDS CHFAHSFIELKVWLLQQYSGMTH EDI VQSSKKYWQQMEAH 
AGJCASSSMGAPRTHGPSTFDLC^KFVCGQCWRNGCVVEPDKDLK 
YCSAKARHCWTKERRVliLVMS KAKRKWVS VRPLPS I RNFPQQ YD 
IiCIHAQNGRKCQYVGNCSFAHSPEERDMWTFMXENKILDMQQTY 
DMWLKKHNPGKPGEGTP I SSREGBKQIQMPTDYADI MMGYHCWI* 
OGKNSWSKKQWQQHIQSEKHKEKVFT3DSDASGWAFRFPMGEFR 
LCDRLQKGKACPDGDKCRCAHGQEEIjNEWLDRREVI^QKLAKAR 
KD MLIjCPRDDD FG KYNFLIjQEIXSD IaAGATP E AP AAAAT ATTGE 


5583 


3 


1265 


SSGCRQGRPGRSDRPRPPPRRHKMVKETR YYDILGVKPSASPEE
IKKAYRKIJu^KYHPDKlJPDEGEKFKLISQAYEVI*SDPKKRDVYD 
QGGEQAI KEGGSGS PS FSS PMDI FDMFFGGGGRMARERRGKNW 
HQLSVTIiEDLYNGVTKKLALQKNVI CEKCEGVGGKKGS VE KCPL 
CKGRGMHIHIOQIGPGMVQQIQTVCIECKGQGERINPKDRCESC 
SGAKVIREKKI I EVHVEKGMKDGQKILFHGEGDQEPELEPGDVI 

XViajQKDHSVFQRRGHDLIMKMKIQI^E^CGFKKTlKTLDNRI 
LVITSKAGEVIKHGDIiRCVRDEGMPIYKAPLEKGIIjI IQFT.VTF 
PEKHWLSLEKLPQLEALLPPRQKVRITDDMDQVELKEFCPNEQN 
WRQHREAYEEDEDGPQAGVQCQTA 


5584 


3


1265 



SSGCRQGRPGRSDRPRPPPRRHKMVKETRYYDI lgvkpsaspee 

ikkayrkiju,kyhpdkkpdegekfkx.isqayevlsdpkkrdvyd 
qggeqaikeggsgspsfss pmd i fdmffggggrmarerrgknw 
1 hqdsvtlf^l yngvtkklalqknvi cekcegvggkkgsvekcpl* 
ckgrgmhihiqqigpgmvqqi qtvcieckgqger inpkdrcesc 
sgakvirekkiievhvekgmkdgqk i lfhgegdqepelepgdvt 
i vldq kdhsv fqrrghdl i mkmki qls ealcgfkktiktldnri 
lvi ts kagevikhgdlrcvrdegmp i ykaplekg 1 1*1 iqflvi f 
pekhwlsleklpqleallpprqkvritddmdqvelkefcpneqn 
wrqhreayeededgpqagvqcqta 


5585


2619 


915 


LPAGTPESSLHEAIiDQCMTADDLFI*TNQFSEAIiSYLKPRTKESM 
YHS I/TYATILEMQ AMMTFDPQDI IiIiAGNMMKEAQ'M LCQRHRRKS 
S VXDS FSSLVNRPTLGQFTEEEIHAEVCYAXr r J >QRAALTFXQD 
ENMVS F I KGG I KVRNS YQT YKEL.DSLVQSSQYCKGENH PH FEGG 
VKLG VGAFMLTLSML PTR I JLRI..LE FVG FSGNKD YGLLQLEEGAS 
GHS FRS VIjCVMLLLCYHTFLTFVIjGTGNVNI EEAE kllkpylnr 
YPKGAI FLFIiAGRI EVIKGNIDAA I RRFEECCEAQQHWKQFHHM 
CYWELiMW CFTYKGQW KMS Y F Y ADIjLS KENCWSKATYI YMKAAYL 

smfgkedhkpfgddevfxfravpglklkiagkslptekfairks 

RRYFSSNPISXPVPALEMMYIWNGYAVIGKQPKIjTDGILEIITX 

aeejo.ekgpeneysvddeclvicllkglclkyi^rvqeaee^ 
isanekkikydhyli pnallelalllmeqdrneeai kllesakq 
nyknysmesrthfriqaatlqaksslbnssrsmvssvsl 


5586 


2619 


915


LPAGTPESSLHEALDQCMTALDLFLTNQFSEALSYLKPRTKESM 
YHSLT YAT 1 LEMQAMMTFDPQD I LLAGNMMKEAQMLCQ RHRRKS 

SVTDS FS slvnrptlgqfteee ihaevcyakcllqraaltflqd 

ENMVS FIKGGIKVRNS YQTYKELDSLVQS SQ YCKGENHPHFEGG 

VKLGVGAFNLTLSMLPTRILRLLE FVGFSGNKDYGLLQLEEGAS [ 

GHS FRS VI^CVMLLLC YHTFLTFVLGTGNVN 1 EE^ 

YPKGAI FLFIJ^R I EVIKGNIDAAI RRFEECCEAQQHWKQFHHM 

CYWELMWCFT YKGQWKMS YFYADLLSKENCWS KAT Y I YMKAAYL 

SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 

rryfssnp i slpvpal emmy i wng yavi gkqpkltdg ileiitk 
aeemlekgpeneysvddeclvkllkglclkyi/;rvqeaeenfrs 
isanekkikydhylipnaij^erju,llmeqdrneeaiia.lesakq 

NYXNYSMESRTHFRlQAATLQAKSSLENSSRSMVSSVSL 


5587 


1768 


148


SSAVPDGAVGRPVAVAVGGPPHS CRCRP CCX.MAAI G VHLG CTSA


CVAVYKIX3 RAG VVAND AGDR VTPAVVAYS ENE B I VGLAAKQS RI 
RNISNTVMKVKQI LGRSSSDPQAQKYIAESKCLVIBKNGKLRYE 
IDTGEETKFVNPEDVARLIFS KMKETAHSVLGSDANDWITVPP 
DFGBKQKNAIX;EAARAAGFlJVLRLIHB?SAAiLAYGIGQDS PTG 
KSNILVFKDGGTSI^I^VMEVNSGIYRVX»STNTDDNIGGAHFTE 
TIAQ YliAS EFQRS FKHDVRGNARAMMKI*TNSAEVAKHSLSTLGS 
ANCFI*DSLYEGQDFDCNVSRARFEIiIiCS PL.FNKCI EAIRGIaLDQ 
NGFTADDINKWLCGGSSRIPKLQQLIKDIiPPAVELLNSIPPDE 
VIPIGAAIEAGlLIGKEmLVEDSLMlECSARDILVKGVDESGA 
SRFTVLFPSGTPLPARRQHTLQAPGSISSVCLELYESDGKNSAK 
EETKFAQ WIjQDLDKKENG LRDIIAVI/TMKRDG S LHVTCTDQET 
GKCEAISIEIAS 


5588


3 


589 


TPPPPEQAMVAATVAAAWLIO.WAAACAQQEQDFYD?KAVN 1 RGK 
LVSLEKYRGSVSLVVNVASECX3FTDQH 

VLAFPCNQFGQQEPDSNKEIESFARRTYSVSFPMFSKIAVTGTG 
AHPAFKYLAQT SGKEPTWNFWKYLVAPDG KWGAWD PTVSVEEV 
RPQITALVRKL J LLKREDI, 


5589 


1884 




IJ*OAWHEGGIGOTDKERGAAALPGEEGDPTRGRSLGRASWESGS 
PRRPRSP FSSFIiPRPICO^SLEARPCS 1 EDRRNWSliIGRPGAPAS 
GLNRSSGLWI^PDRCRPR^RCSCRVMENPSPAAAl^KALCALLL 
ATIJGAAGQPLGGESICSARAPAKYSITFTGKWSQTAFPKQYPLF 
RPPAQVJSShZjGRAHSSDYS MWRKNQYVSNGLRDFAERGEAWAIjM 
KEIEAAGEAIjQSVHAVFSAPAVPSGTGQTSAELEVQRRHSLVS? 
WRXVPS PDWFVGVDSLDLCDGDRWREQAALDLYPYDAGTDSGP 
TFSSPNFATIPODTVTE ITSSSPSHPANS FYYPRLKAL.PPIARV 
TLLPiRQSPPAFlPPAPVLPSRDNEIVDSASVPETPLDCEVSliW 
SSWGLCGGHCGRLGTKSRTRYVRVQPANNGSPCPELEEEAECVp 

DNCV 


5590 


72 


896 


" IaCSSGALRLjIjP AMVAWRSAFL VCZLAFS LxATL»VQRGSGDFDD FN1* 

ei^vketssvkqpwdhttttttnrpgttrapakppgsgldlada 
lddqdix;rrk^iggrerwnhvttttkrpvttp^antlgndfd 
ladalddrndredgrrkpiaggggfsdkdledxvgggeykpdkg 
kgdgr ygsnddpg sgmvaepgtiag vasalamaii i gavssy i sy 

QQKKFCFS I QQGLNAD YVKGENIiEA WCE E P QV K YST LHTQ S AE 
PPPPPEPARI 


5591 


1 CO 
1 


1494 


AGSSRKAAABRLliVSAGCRSIAGRASGVl.bLPAEIiLPGEEEAMA 
LRVTRNS KINAENKAKINMAGAKRVPTAPAATS KPGLRPRTALG 
DIGNKVS EQLQAKMPMKKEAKPS ATGKVI DKKLPKPLEKVPMLV 
PVPVSEPVPEPEPEPEPEPVKEEKLSPEPIUVDTASPSPMETSG 

CAPAEEDXi CQAFSD V I LAVNDVDAEDGAD PNLrCS B YVKD I YA YL 
RQLEEEQAVRPKYLIiGREVTCNMRAIIilDWLVQVQMKFRLLQET 
MYMTVSI IDRFMQNNCVPKKMLQIiVGVTAMFIASKYEEMYPPEI 
GDFAFVTDNTYTKHQIRQMEMKI LRALNFGIX3RPI*PLHFLRRAS 
KIGEVDVEQHTLAKYLMFXTMLDYBMVHFPPSQIAAGAFCLALK 
I LDNGEWTPTLQHYLS YTEESLLPVMQHIJUCMAAMVNQGLTKHM 
TVKNKYATS KHAKI STL PQlaNSAIiVQDLAKAVAKV 


5592 


242 


924 


YGES KDWNQKJDU^S ALVLTrVNCLPTP I MAKSAEV KLAI FGRAG 
VGKSAIrVVRFLTKR F I WE YDPTLES TYRHQATI DDEWSMEILD 
TAGQEDTIQREGHMRWGEGFVLVYDITDRGSFEEVliPLKNILDE 

I KKP KNVTLr I»VGNKADI*DHS RQVSTEEGEKIiATELACAFYECS 
ACTGEGN I TE I FYELCREVRRRR^QGKTRPJIS STTHV KQAllv K 
MLTKISS 


5593 


3 


1113 


" HA5GGRAANMAAERGAGQQQSQEMMEVDRRVESEIESGDEEGKKH 
S SG I VADLS EQSLKDGEERGEEDPEEEHELP VDMET INIiDRDAE 
DVDLmmilGKIEGFEVljKKVKTI^RQOTjIKCIEWIjEELQSLR 
BLDIi YDNQ I KKI ENLEALTELEILDI S FNLLRN I EGVD KLTRLK 
KLFLVNNKI S KI ENI*SNI*HQLQMIjEI«GSNRI RAIENI DTLTNLB 
SLFI^KNKITiO^NLDAl.T^TVI^MQSNRLTKJrEGI^NLVNLR 

ELYLS HNG I EVIEGt»ENNNKLTMI*DI ASNRI KKIENI SHLTELQ 
EFWMNDNI^ESWSDLDELKGARSLETVYI>ERNPIiQKDP<>YRRKV 








MLALPS VRQ IDATFVRF


5594 


3 


1113 


HASGGRAANMAAERGAGQQ^SQEfWEVDRRVESEESGDEEGKKH 
SS G I VADLS E QS LKDGEERGE ED PEESHELP VDMET I NLDRDAE ! 
DVDLNHYR IGKI EGFEVLKXVKTLCLHQNLI KCI ENLEELQSLR | 
ELDL YDNQ I KK I ENLRALTELE I LD I S FNLLRN I EGVDKLTRLK 
KLFL VNNKI S Kl SNLSNLHQLQMLE LG SNR IRA 1 2NI DTLTNLE 
SLFT^KNKITKLQNIJIALTNLTVLSM^ 

ELVI»SHNG1EVIEGLEN1^KIjTMLDXASNRIKKI2N1SHLTELO 
EFWHNDNXiLESVJSDIJDELKGARSLETVYLERNPI^KDPQYT^ 
MLALPS VRQI DATFVRP


5595 


3 


1476 



ARWNG RW VQV PAW PG PG CGTNASGERQRQLPRAWRP VGRTLGSE 
PIALAWSPPLY1JPIPLPSWAVSQPTPTLGTMFADLDYDIEEDK 
LG I PT VFGKVTLQ. KDAQNL I G I S I GGGAQ YCPCLYI VQ VFDNTP 
AALDGTVAAGDE I TGVNGRS I KGKTKVEVAKM I QEVKGEVT I HY 
NKLQAD P KQGMS LDI VLKKVKHRLVENMS SGTADALGLS RAI LC 
NIX3LVKRI^LERTAELYKGMTEHTKNIJ J RAPYEl»S0^TmAFGD 
VFSVIGVREPQPAASEAFVXFADAHRS XEKFGIRLLKT I KPMLT 
DLNTYLNKAI PDTRLTIKKYLDVKFEYLSYCLKVKEMDDEEYSC 

IALGEPLYRVSTGNYEYRLrLRCRQEARARFSQMRKDVLEKMSL 
1WKHVQDIVFQLQRLVSTMSKYYNDCYAVLRDADVFPI EVDLA 

HTTIAYGI^QEEFTDGEEEEEEEDTAAGEPSRDTRGAAGPLDKC 
GSWCDS


5596


698 


219 


GAVLAPSSLPAAELAAQGES QSLEDLSNTSRPTSE VYXI S F I FP 
NGDKYDGDCTR'TSSGI YERNGIGIHTTPNG I VYTGSWKDDKMNG 
FQRLEHFSGAVYEGQFIO^NMFHGLGTYTFPNGAKYTGNFNENRV 
KGEGEYTHIQGTRMDWTFHFTSCSQT


5597 


3 


731 


I SCKMAADGQS SLPASWRS VTLTHVE YPAGDLSGHLLAYLSLS P 
VFVIVGFVTLIIFKRELHTISFIiGGI^AIaNEGVNWLIKNVIQEPR 
PCGGPHTAVGTKYGMPS SHSQ FMWFFS VYS FLFLYLRMHQTNNA 1 
RFLDLLW RHVLSLGLLAVAFLVSYS R VYLLYHTWSQVLYGG IAG 
GU1AIAWFIFTQEVLTPLFPRIAAWPVSEFFLIROTSLIPNVLW 
FEYTVTRAEARNRQRKLGTKIiQ 


5598 


326 


2440 


GIGPIAAS FIFCKVASLYI FLSPPPPS VSG VPYS PANSS WS CAL 
VPLLX3SGVPPHPPAPSPCCSGC?n^KMLSFKLlJjLAVALGFFEG 
DAKFGERNEGSGARRRRCLNGNP PKRLKRRDRRMMS QLELLS GG 
EMLCGGF YPRLS CCLRSDS PGLGRLENKI FSVTNNTECGKLLEE 
I KCALCS PHSQS LFHS PEREVLERDLVLPLLCKDYC KEFFYTCR 
GHIPGFLQTTADEFCFYYARKDGGLCFPDFPRKQVRGPASNYLD 
QMBEYDKVEEI SRKHKHNCFCIQEVVSGLRQPVGALHSGDGSQR jj 
LFIUBKEGYVKILTPEGEIFKEPYLD1HKLVQSGIKGGDERGLL 
SLAFH PNYKKNG KL Y VS YTTNQERVZAI G P HDHILRWE YTVS RK 
NPHQVDLRTARVFLEVAELHRKHLGGOLLFGPDGFLYI I LGDGM 
I TLDDME EMDGLS D FTGS VIiRLD VDTDM CNVP YS I PRSNPH FNS 1 
TNQPPEVFAHGLHDPGRCAVDRHPTDININLTILCSDSNGKNRS 
SARILQI IKGKDYESEPSLLEFKPFSNGPLVGGFVYRGCQSERli 
YGSYVFGDRNG^FLTLQQSPVTKQWQBKPLCLGTSGSCRGYFSG j 
HII/3FGEDEU3EVYILSS^KSMTQTHNGKLYKIVDPK31PLMPEE j 
CRATVQPAQTLTSECSRLCRNGYCTPTGKCCCSPGV7EGDFCRTG j 


5599 


326 


2440 


G IGPI AAS FI FC KVASLYI FLS PPPPS VSGVPYSPANSSWSCAli 
VPLDGSGV P PHP PAPS PCCSGQTMLKMLSFKUjLLAVALGFFEG 
DAKFGERNEGSGARRRRCLNGNP PKRLKRRDRHMMSQLELLSGG 1 
E^CGG FYPRI^ CCLRSDS PGl^RI^K I FSVTN^ECGKLLEE 
I KCALCS PHSQSLFHS PERBVLERDLVLPLLCKDYCKEFFYTCR 
GHIPGFLQTTADEFCFYYARKDGGLCFPDFPRKQVRGPASNYLD I 
QMEEYDKVEE ISRKHKHNCFCI QEVVSGLRQPVGALHSGDGSQR 
LFILEKEGYVKILTPEGEIFKEPYLDIHKLVQSGIKGGDERGLL 
S LAFHPN Y KKNGKLYVS YTTNQERWAIGPHDH I LRWE YTVSRK 
NPHQVDLRTARVFLEVAELHRKHLGGQLLFGPDGFLYI ILGDGH 
ITLDDMEEMDGLSDFTGSVLRLDVDTDMCNVPYSIPRSNPHFNS 
TNQPPEVFAHGIJiDPGRCAVDRHPTDININLTILCSDSNGKNRS | 



5600 



5601



5602 



5603 



5604 






1977 



1244 



1977 



1244 



246 



766 



565 



1506 



5605 



35 



1821 



5606 



1099 




LI LQI IKGKD YESEP S LLE FKP FSNGP L VGGFVYRG CQS ERL 
YGSWFGDimGNFLTLQQSPVTKQWQEKPLCLGTSGS CRGYFSG 
HIU5FGEI)EI^EWI1£SSKSOT<^HNGKLYKIVDPKRPLMPEE 
CRATVQPAQTIjTS ECSRLCRNGYCTPTGKCCCS PGWEGDFCRTG 



SLRVLSGHLMQTRDLVQPDKPASPKFIVTLDGVPSPPGYMSDQE 
EDMCFEGMKPVNQTAASNKGLRGLLHPQQLHLLSRQLEDPNGSF 
SNAEMSELSVAQKPEKLLERCKYVJPACKNGDECAYHHP I S PCKA 
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRI PVLS PKP 
AVAPPAPPSSSQLCRYFPACKKWECPFYHPKHCRFNTOCTRPDC 

TFYHPTINVPPRHALKWIRPQTSE _ 

SLRVLSGHLMQTRDLVQPDKPASPKFIVTLDGVPSPPGVMSDQE 
EDMCFEGMKPVNQTAASNKGLRGLLHPQQLHLLSRQLEDPNGSF 
SNAEMSEI»SVAQKP EKLLERCKYW PACKNGDECAYHHP I S PCKA 
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRI PVLS PKP 
AVAPPAPPSSSQLCRYFPACKKMECPFYHPKHCRFNTQCTRPDC 

TFYHPT I NVPPRHALKW I R PQTSB 



YHTS CT^/MPTa^B Z k T .RK rTEVPVGCLM V YNNEWGKGRNE VNQTK 
NATRHAEMVAIDQVLDW CRQSGKS PSE V FEHT V LYVTVEPC IMC 
AAALRLMK1 PLV\T^ GCQNER FGGCGSVLNXAS ADLPNTGRP FQC 
I PGYRAEEAVEMLKTFYKQENPNAPKSKVRKKECQQILNMF 

FRGR T P I S GGERG CAQY P I P ATP ARS G ENRTM P GAGDGG KAPAR 
WLGTGLLGLFLLPVTLSLEVSVGKATDIYAVNGTEILLPCTFSS 
CrcFEDLHFRWTYNSSDAFKILIEGTVKNEKSDPKVTLKDDDRI 
TLVGSTKEKRNNIS IVLRDLEFSDTGKYTCHVKNPKENNLQHHA 
TIFLQWDRRMQ 



EDIF PAQLLKLQRHERVWQQE P P VRDHRS WGGSGAGG VAGRE WT 
IX3GQVA1jGGH YMAEGEGY FAMS EDELACS P YI P LGGD FGGGD FG 
GGDFGGGDFGGGD FGGGGS FGGHCLDYCESPTAHCNVLNWEQVQ 
RLDGILSETIPIHGRGNFPTLEI^PSLIVKVVRRRLAEKRIGVR 

DVRLNGSAASHVLHQDSGLGYKDLDLI FCADLRGEGEFQTVKDV 
VLDCLLDFLPEGWKEKITPLTLKEAYVQKMVKVCNDSDRWSLI 
SLSNNSGKNVELKFVDSLRRQFEFSVDSFQIKLDSLLLFYECSE 
NPMTETFHPTI IGESVYGDFQEAFDHLCNKI I ATRNPEEIRGGG 
LLKYCNLLVRGFR P ASDE I KTLQRYMC S RFF I DFS D I GEQQRKI* 
ESYIjQmiFVGLEDRKYEYIJvlTLHGVVNESTVCLMGHERRQTLNL 
ITMLAIR VLADQNVI PNVANVTCYYQPAPYVADANFSNYYIAQV 

QP VFTCQQQTYSTWLPCN 

SQRSCPRSPSSPAPPWARCSNPDSRTGGVPVPRAWSAGGPALGL 
MAAPVRLGRKRPLPACPNPLFVRWLTEWRDEATRSRHRTRFVFQ 
KALRSLRR YPLPLRSGKEAKILQH FGXX5LCRMLDERLQRHRTSG 
GDHAPDSPSGENSPAPQGRLAEVQDSSMPVPAQPKAGGSGSYWP 
ARHSGARVI LLVLYREHLNPNGHHFLTKEELLORCAQKS PRVAP 
GSARPWPALRSLLHRNLVI^THQPARYSLTPEGLELAQKliAESE 
GLSLLNVGXGPKEP PGEETAVPGAASAELASEAGVQQQPLEIjRP 
GEYRVLLCVDIGErTRGGGHRPElJjRELQRL^ 
VWVAQETNPRDPANPGELVLDHI VERKRLDDLCSSI IDGRFREQ 
KFRLKRCGLERiiVYLVEEHGSVHNLSLPESTIJX3AVTNTQVIDG 
FFVKRTADI KESAAYLALLTRGLQRLYQGHTLRSRPWGTPGNPE 
SGAMT3 PN PLCSLLTFS DFNAGAI KNKAQS VREVFARQLMQVRG 
VSGE KAAALVDR YSTPAS LXiAAYDACATPKEQETLLSTI KCGRL 
QRNLG PALS RTLSQLYCS YGPLT 



l3RSRC?GPGARGGTMSPRSCLRSLRLLVFAVFSAAASNWLYLAK 
LS S VGS I SEEETCEKLKGLI QRQVQMCKR15LE VMDS VRRGAQLA 
IEECQYQFRNRRWNCSTLDSLPWGKVVTG^TREAAFVYAI SSA 
GVAFAVTRACS SGELEKCGCDRTVHG VS PQG FQWSGCS DN I AYG 
VAFS Q S FVD VRERS KGAS S S RALMNL HNNEAGRKAI LTHMR VE C 
KCHG VS G S CEV KTCW RAVP P FRQVGHALKEKFDGATE VE P RR VG 
SS RAL V? RNAQ FKPHTDED L VYLE P S PDFCEQDMRSGVLGTRGR 
TCNKTS KAXDG CEX.LCCGRGFHTAQVELAERCS CKFHWCC FVKC 
RQCQRLVELHTCR 



5607 



521 






141 



5608 



983 



5609 



1628 



304 



5610 



54



1196 



5611 



577 



5612 



721 



5613 



115 



1279 



5614 



1268 



PPVCNPAEAMPSPGTVCSLU,U^MI»WLDLAMAGSSFLSPEHQRV 
QQRKESKKPPAKLQPIU^IaAGWLRPRIXjGQAEGAEDELEVRFNAP 
FDVGIKLSGVQYQQHSQALGKFLODILWBEAKEAPADK 



WFQSPIjRQADPGPPRHTLFMDFVAGAIGGVCGDAVG Y PLDTVKV 
RIQTEn?KYTGIWHCVllDTYHRBRWGPYRGLl^PVCTVSLVSSE 
VFGTYRHCLAH I CRLRFGNPDAKPTKAD I TLSGCASGliVR VFI/T 
SPTEVAKVRI^TQTQAQKQQRRLSASGPLAVPPMCPVPPACPEP 
KYRGPLHCLATVARJJEGLCGLYKGSSALVLRDGH S FATYFLS YA 
VLCE W LS PAG H S R P D VPG VL VAGG CAGVLA WA VATP MD VI KS R L 

QADGQGQRRYRGLLHCMVTIVREEG?RVIjFKGI*VtiNCCRAFPVN 
MWFVAYEAVLRI ARGLLT 

AKGVWVLPS PP PRPGRGALVSGSGLRRGRSGTS WRPRRMNHKS K 

KRIREAKRSARPEXKDSLDWTRHNYYESFSLSPAAVAJDNVERAD 

ALQLS VEE FVER YERP YKP WLLNAQEGVJS AQE KWTLERLXRKY 

RNQKFKCGEDNDGYSVKMKMKYYIEYMESTRDDSPJjYIFDSSYG 

EHPKRRKLLEDYKVPKFFTDDLFQYAGEKRRPPYRWFVMGPPRS 

GTGIHIDPLGTSAWNALVQGHKRWCLFPTSTPRELIKVTRDEGG 

NQQDEAITWFNV I YPRTQLPTWPPEFKPLE I LQKPGETVFVPGG 

WWIIVVIjNIJXrTIAITQNFASSTNFPVVWHKTVRGRPKI^RK^ 

II»KQEHPEIAVIiADSVDLQESTGIASDSSSDSSSSSSSSSSDSD 

SBCESGSEGDGTVHRRKKRRTCSMVGNGDTTSQ0DCVSKERSSS 
R 

LERTPASADMAWTKYQLFLAGLMLVTGS INTI>SAKWADNFMAEG 

CGGSKEr^FQHPFLQAVGMFI^FSClAAFYLLRCRAAGQSDSS 

VDPQQPFNPI^FLPPAI>CDMTGTSJjMYVA1jNMTSAS^ 

VI I ftgi*fsvaflgrri»vlsqwlg ILATIAGLVVVGLADLLSKH 

DSQHKLSETVITGDljLIIt^QIIVAIQMVLEEKFVYKHNVHPLRA 

VGTEGliFGFVIIiSljLIiVPMYYIPAGSFSGNPRGTliEDALDAFCQ 

VGQQPLI AVAL.LGNISS IAFFNFAGI SVTXELSATTRMVLDSLR 

TWIWAIiSI^GWEAFHALQII^FLIl^IGTALYNGOiRPLIiGR 

LSRGRPIAEESEQERLLGGTRTP1NDAS 



FVLPNRJjGI PGS TFRGPGACASSSS LAAS AKPGAGGS PALAMSG 
ELSNRFCGGKAFGLLKARQERRLAEINREFLCDQKYSDEENLPE 
KLTAFKEKYME FDLNNEGE I DLMSLKRI4KEKLGVPKTHLEMKKM 
ISEVTGGVSDTI S YRDFVWMMLGKRSAVIiKLVMMFEGKANESS P 
KPVGPPPERDIASLP 



ASRDGYMDATIAPHRI PPEMPQYGEENHI FELMQAT4WL.CKHLNS 
SIXTLENLIIjNEPSYTATEARRLYI<JRKTVPSAI>LVQLIQER1A 
EEDCI KQGWILDGI PETREQALRIQTLG I TPRHVI VLSAPDTVL 
IERI^KRIDPQTGEIYHTTFDWPPESEIQNRLMVPEDISEIjET 
AQKIxIiEYHRNIVRVI PSYPKI LKVISADQPCVDVFYQALTYVQS 
NHRTNAPFTPRVLIjLGPVGS 



RGVDPALRRAEKWLPIjSIKDDEYKPPKFNLFGKISGWFRSIIjSD 
KTSRKLFFFLCLNLSFAFVELLYGIWSNCLGLISDS fhmffdst 

ai laglaas viskwrdndafs yg yvraevlagfvnglfli ptaf 

FI FSEGVERALAP PDVHHERLLLVS IIjGFWNLiIG I fvfkhggh 

gr^hgsghghshslfngaldqahghvdhchs hevkhgaahshdh 
ahghghttiskxkspsijkettgpsrqn.cstvflhiiiadtlgsigvi 

ASAIKMQNFGLMlADPICSILIAIIilWSVI PliLRESVGILMQR 
TPPUjBNS LPQCTYQRVQQLQGV YS LQEQHFWTLCS D VYVGTLKL 
I VAPDADARWI LSQTHNI FTQAG VRQL YVQ I DFAAM 



LLSRNEHAC PLCAG LGIiTQRKP KAIRGREGRATNQG QGETQNE R 
APWGARQRLGVMAELQQLQEFEI PTGREALRGNHSALIiRVADYC 
EDNYVQATDKRKALEE TMA FTTQAIAS VAYQ VGNLAGHTLRMLD 
DQGAALRQVEARVSTIX^MVNMHMEKVARRE IGTLATVQRIiPPG 
QKVIAPENLPPLTPYCRRPLNFGCLDDI GHGI KDLSTQLSRTGT 
LSRKSI KAPATPASATIiGRPPR I PEPVHLPVVPDGRLSAASS AS 
SLASAGSAEGVGGAPTPKGQAAPPAPPLPSSLDPPPPPAAVEVF 
QRPPTLEELSPPPPDEELPLPLDLPPPPPLDGDELGI.PPPPPGF 
GPDEPSWVPAS YlaEKWTLYP YTSQKDNELS FSEGTVI CVTRRY 








SDGHCEGVSSEGTGFFPGNYVEPSC 


5615 


9 


1558 



ALGRRRPGDPREMEAAATPAAAGAARREELDMDVMRPLINEQNF 1 
DGTSDEEHEQEUjPVQKHYQLDDQEGI SFVQTIJ'IHIjIjKGNIGTG 
LLGLPLAIKNAGI VLG PISLVFIGI ISVHCMHILVRCSHFLCLR 
FKKSTLGYS DTVS FAMEVS PWSCLQKQAAWGRSVVD FFLVTTQL 
GFCSVYIVFIAENVKQVHEGFLESKVFISNSTNSSNPCERRSVD 
LR I YMLCFLP FI I LLVFIRELKNL FVLS FLANVSMAVS LVT I YQ 
YWRNKPDPHNLP I VAGWKKYPLFFGTAVFAFEGIGWLPIiENQ 
MKESKRFPQALNIGMG I VTTLYVTIATIiGYMCFHDE IKGSITI*N 
LPQDVWLYQ S VKILYS FGI FVTYS IQFYVPAE III PGI TSKFHT 
KWKQICEFG IRSFLVS ITCAGAI LI PRLDIVI SFVGAVSSSTLA 
L I L? P LVE I LTFS KEHYNI WMVLKN I S IA FTG WG FLLGTY I TV 
EE 1 1 YPTPKWAGTPQSPFLNLNSTCLTSGLK 


5616 


1 


719 


DD FVROG PQ S AAMGAS ARLLRAV IMGAPGSGKGTVSSR I TTH FE 
LKHI^SGDLLRD^liRGTEIGVIiAKAFIDGCKLIPDDVWTRLAL 
HELXNLTQ YS WLIiDGFPRTLPQAEALDRAYQ I DTVINLNVP FEV 
I KQRLTARWIHPASGRVYNIEFKPPKTVGIDDLTGEPLIQREDD 
KPETVIKRLKAYEDOTKPVLEYYQXXGVLETFSGTETNKIWPYV 

YAFLQTKVPQRSQKASVTP 


5617 


176 


765 


PWRGRGSRPRGAGAMAEEQVNRSAGLAPDCEASATAETTVSSVG 
TCEAAGKSPEPKDYDSTCVFCTRIAGRQDPGTELLHCENEDLICF 
KDI KPAATHHYLWPKKHI GNCRTLRKDQVE LVENMVTVGKTIL 
ERNNFTDFTNVRMGFHMPP PCS I SHLHLHVLAPVDQLGFLS KLV 
YRVNSYWFITADHLIEKLRT 


5618 



3 


1692 



YLNYINLiCSEJmjSGKEDLWEiO.QYI*WKSTIiNLPEDLLRVPDES 
LFLNSGGDSLKSI RLLSEI EKLVGTSVPGLLBI ILSSS ILEI YN 
HILQTVVPDEDVTFRKSCATKRKLSNINQEEASGTSIiHQKAI.yrr 
FTCHNEINAFVVLSRGSQILSLNSTRFLTKLGHCSSACPSDSVS 
QTNIQNLKGLNSPVLIGKSKDPSCVAKVSEEGKPAIGTQKMELH 
VRWRSDTGXCVDAS PLWI PTFDKSSTTVYIGSHSHRMKAVDFY 
SGKVKWEQI LGDR I ES SACVS KCGNF I WGCYNGL VYVLKS NSG 
EKYT^FTTEDAVKSSATMDPTTGLIYIGSHDQHAYALDIYRKKC 
VWKSKCGGTVFSS PCLNLI PHHLYFATLGGLLLAVNPATGNVIW 
KHSCGKPLFSS PQCCSQY I CIGCVIX3NLLCFTHFGEQVWQFSTS 
GPI FSSPCTSPSEQKI FFGSHDCFI YCCNMKGHLQWKFETTSRV 
YAT P FAFHNYNGSNEMLLAAASTDGKVW I LES QSGQLQSVYE LP 
GEVFS S PWLE SMLIIG CRDNYVYCLDLIX3GNQK 


5619 


2160 


1477 


DSPVLPTSGNVISTAQPAQPWSAVEAALRSLGSPPGAGRGCPCP 

AQSIiHSHQLAAWDPLKPSLRSYPPHLLQHPUIjRoIjl 

RSCPQPRPLEEUJlAGSSTRPQPLTSSCCGMSC3yiYSFLGHCSVL 

LWGTKGRGSGSPSSPGCCLHPPAQHSQDLPLVHVDVGWQPPLGP 

TVGLRPGLLGERQRGALRAGDPQCQCPLPATVREDLGVPSPWAA 

ECSPPATP 


5620 


930 


182 


PLPP PTLAMFLTRSE YDRGVNTFSPEGRLFQVEYAI EAI KLGST ■ 
AIGIQTS EGVCLAVEKRITS PLMEPS S I EKI VE I D AH I GCAMSG 
LIADAKTLIDKARVETQNHWFTYNKTMT Vhj^v Av v» 
EEDADPGAMSRPFGVALLFGGVDEKGPQI*FHMDPSGTFVQCDAR 
AIGSASEGAQSSLQEVYHKSMTLKEAI KSSLIILKQVMEEKLNA 
! TNIELATVQPGQNFHMFTKEELEEVIKDI 


5621


J 




ENDLTWVLKHCERFLKQQO^SIKSSLLCXG^NYAGHDWFVSSI^ 
M IPILGDKEKTFQFIiHQFSRIiLTSAFLWLPRLHISSYl^iroTVES 
G IHPVYFCS THY I EMLLKAELPLVFS AFHMSGFAPSQ I CLQWI T 
QCFWNYLDWIEICHYIATCVFIjGPDYQVY I CIAVFKHLQQDILQ 
HTQTQDLQ VFLKEE ALHGFRVS DYFEYME I LEQN YRTVLLRDMR j 
NIRLQST 


5622 


1122 


456 


"aastki^vsrkrshsaseksgtgtsiskrlnmnpqirnpmkamy 
pgtfyfqfkm*weaot)r2tetwlcftvegi krrs wswktgvfrn 
QVDSETHCHAERCFLSW fcddilspntkyqvtwytsws pcpdca 

GEVAEFLARHSNVNLTI FTARLY YFQYP CYQEGLRS LS QEGVAV 








BIITOYEDFKYCWENFVYNDNEPFKPWKGLKTNFRIjLKRRLRESL 
Q 


5623 


3 


954 


FLP FF I RAPKI S RNGQWLFTFTTP FPFANKAL PGWEG I VPACFW 
RKKILTPSTGTMELLQVTILFLLPS ICSSNSTGVLEAANNSLVV 
TTTKPS ITTPNTESLQKNVVTPTTGTTPKGT ITNBIjLKMS LMST 
i flitbi a iujc^ JbisAi x 1 DVRKNDS I ISNVTVTS VTLPNAVSTLQS 
SKPKTETQSSI KTTEI PGSVLQPDASPSKTGTLTSI PVTIPENT 
SQSQVIGTEGGKNASTSATSRSYSS I ILPWI ALIVTTLSVFVL 
VGLYRMCWKADPGTPENGNDQPQSDKESVKLLTVKTISHESGEH 
SAQGKTKN 


5624


1 159 


898 


PGVAAAAGALPQYHGPAPALVSCRRELSL^AGSI^LERiOlRDFT 
SS GSRKLYFDTHAL.VCLLEDNGFATQQAE 1 1 VSALVKILEAN>TO 
IVYKDMVTKMQQEITFCXJVMSQIANVKK^ 

NEKIKLELHQLKQQVMDEVT KVRTDTKLDFNLEKSRVKBLYSLN 
I EKKLLELRTE IVAI4€AQQDRAliTQTORKIETEVAGI*KTMI*ES HK 
LDN I KYDAGS I FTCLTVALG FYRLW I


5625 


1 


1180 


1 TI PSS AAAQRAG PPAGALEALS PGGARAHAERRG EMRAT P LAAP 1 
AGSLSRKKRLELDDNIJDTERPVQKRARSGPQPRIiPPCLLPLSPP 
TAPDRATAVATASRLGP YVLLE PEEGGRAYQALHCPTGTE YTCR I 
VYPVQEAl^VLEPYARLPPHKHVARPTEVl^GTQLLYAFFTRTH 1 
GDMHSLVRSRHR I PEPEAAVbFRQMATALAHCHQHGnVIiRDLKL 
CRFVFADRERKKLVLEOTEDSCVLTGPDDSLWDKHACPAYVGPE 
ILSS RAS YSG KAADVWSI/JVAIiFTMJLAGH YPFQDS E PVLL FGKI I 
RRGA Y ALPAGLSAPARCLVRCLLRRE PAERLTATG I LLHP WLRQ 
DPMPIjAPTRSHIjWEAAQVVPDGIjGIJDEAREEEGDRKWIjYG j 


5626 


3123 


2011


PPRALGS VAMENQVLTPHVYWAQRHRELYLRVELSDVQNPAI S I 
TEITVTjHF KAQGHGAKGDNV Y E FHLEFLD LVKPE PVYKLTQRQVN 
ITVQKKVS QWWBRLTKQEKRPLFLAPD FDRWLDESDAEMELRAK 
EEERLNKLRJLESEGS PETLTNLRKGYLFMYNLVQFLGFS W I FVN 1 
LTVRFCII^KESFYDTFHTVADMMYFCQMLAWET 
SPVLPSLIQLLGRNFILFI I FGTMEEMQNKAVVFFVFYLWSAIE 
I FRYS FYMLTCIDMDWKVLTV7LRYTLWI PLYPLGCLAEAVSVIQ | 
SIPIFNETCRFSFTLPYTVKIKVRFSFFIiQIYIiIMIFLGLYIITF ! 
RHLYKQRRRRYGQKKKKIH


5627


3123 


2011 


PPRALGS VAMENQ VLTPHVYWAQRHREL YLRVELS DVQNPAI S I 
T ENVLH F KAQGHGAKGDNVY E FHLEFLDL VK P EP VY KLTQR Q VN J 
ITVQKKVSQV7WERLTKQEKRPLFLAPDFDRWLDESDAEMELRAK 
EEERLNKLRLESEGS PETLTNLR KG YLFMYNLVQFLG FSWI FVN 
LTVRF C I LG KES FY'DT FHTVADMMYFCQMLAVVET I NAA I GVTT 
SPVLPSLIQLLGRNFILFIIFGTMEEMQNKAWFFVFYLWSAIE 
I FRYSFYMLTCIDMDWKVLTWLRYTLWI PLYPLGCLAEAVSVIQ 
S I PIFNETGRFSFTLP YPVKI KVRFSFFLQI YLIMI FLGL YINF 
RHLYKQRRRRYGQKKKKIH


5628


75 


1ACC 


VAU/^J'aA^KCijKAGFSSGSLKSPGGASGGSTRVSAMYSSSPCKLP t 
S LS PVARS FS ACSVGLGRSS YRATS CLPALCLPAGG FATS YSGG | 
GGWFGEG I LTGNEKETMQS LNDRIiAGYLEKVRQLEQENAS LES R 
IREWCEQQVP YMCPDYQS YFRTI EELQKKTLCS KAENARLWE I I 
DNAKLAADDFRTKYETEVSLRQLVESDING LRR I LDDLTLCKS D 1 
LEAQVES LKEE LLCLKKNHEEEVNS LRCQLGDRLNVEVDAAP P V 
DLNRVI^EEMRCOYETLVEN^^RRDAEDWT.Ty^05?T^KTlN^K^^/V<^Q<?P 
QLQSCQAE I IELRRTVNALEIELQAQHSMRDALESTLAETEARY 
S S QLAQKQCM I TNVEAQLAE I RADLERQNQE YQVLLDVRARLEC ^ 
E INT YRGLLESEDSKLPCNPCAPD YS PSKSCLPCLPAAS CGPSA 
ARTNCSARPICVPCPGGRF


5629 


2287 


938


GRPRS S SDN RN FLRERAGLS S AAVQTR I GNSAAS RRS PAAR P PV 
PAPPALPRGRPGTEGSTS LS APAVLWAVAVWVWSAVAWAMA 
NYIHVPPGSPEVPKLNVTVQDQEEHRCREGALSLLQHLRPHWDP 
QEVTLQLFTDG I TNKL I GCYVGNTMED WLVRI YGNKTE LLVDR j 
DEE VKS FRVLQAHGCAPQLYCTFNNGLCYE FIQGEALDPKHVCN 
PAIFTILIARQIJ^HAIHAJ^GWIPKSNLWLKMGKYFSLIPTGF { 








ADEDINKRFLSDIPSSQILQEEMTWMKEII^NlXSSPVVliCHNDL 

i «t , k, im i ± ZMiEttiJJl&iJviJE J. U I I oOin x uMii/iuitiir Hiirnuviauv 

DYSLYPDRELQSQVfLRAYIxEAYKEFKGFGTEVTEKEVE ILFIQV 
NQFAIASHFFWGLWAIjIQAKYSTIEFDFIXJYAIVRFNQYFKMKP 
BVTALKVPB 


5630 


1194 


278 


GFWAIAQTCAHHIjPPGS PWLVPAS PWRLPEMSSFGYRTLTVALF 
TLICCPGSDEKVFEVHVRPKKIAVEPKGSLEVNCSTTCNQPEVG 
GLlS I SliDK J LLDEQAQWKH x L vbH .LoMUM. vi^yL^H" l\.ov>A.Uc.ori 
KSNVSVYQPPRQVILTLQPTLVAVGKSFTIECRVPTVEPLDSLT 
LFI/FRGNETLHYETFGKAAPAPQEATATFNSTADREDGHRNFSC 
LAVLDLMSRGGNI FHKHSAP KMLE IYEPVS DSQMVT IVTWSVL 
LSLFVTS VLLCFI FGQHLRQQRMGTYG VRAAWRRLPQAFRP 


5631 


1053 


290


SRVDDFVRPEPSRAEPSRSGRRRPARRAATMSVFGKLFGAGGGK 

AG KGGPTPQE A I QRLRDTEE MLS KKQE FLEKK.I EQELTAAKKHG 

TKNKRAALQALKRKKRYEKOIAQI DGTLSTIBFQREALENANTN 

TFA^KNMGYAAKAMKAAHDNMDIDKVDEL^ 

TAJ S KP VG FGE E FDEDELMAELEELEQE ELDKNLLE I SG PETVP 

LPNVPS lALPSKPAKKKEEEDDDMKELENWAGSM 


5632 


3 


952 


WLGWS P PRRLWWGS LGAAQR PAVPVSGLAKSLHV KTKR PH KKA 
SVRVARGRIXWWAQPQPLLPRPVGSRREMQPPGPPPAYAPTNGD 
FT FVS SADAEDLSGS IAS PDVKLNLGGDFI KESTATTFLRQRG Y 
GWLLEVEDDDPEDNKPLLEELDIDLKBIYYK1RCVLMPMPSLGF 
NRQWRDNPDFWG P LA WLFFSM I SLYGQFRWS WI I TIWIFGS 
LT I FLLARVLGGEVAYGQVLGVIGYSLL PLIV IAP VLLWGS FE 
WSTLIKLFGVFWAAYSAASLLVGEEFKTKKPIiLI YPI FLLYI Y 
FLSLYTGV 


5633 

i 


771 


460 


QGCSKTMSVGRPFYRSSEFMEQLLSSHLHQVPFFCCFTVVCLCN 
CLFENSVSKLYMLCFNFFMS I FFYSLS ITKLNLI YLWGLSYQSL 
LLLLLSGHRPWGSSMV 


5634 


1446 


855 


PRATGR IRSRAAASRPRAGAG ASGAE PRSGRERSRLSGRRAPAM 
ARNTI^SRFRRVDIDEFDENKFVDEQEEAAAAAAEPGPDPSEVD 
GLLRQGDMLRAFHAALRNS P VNTKNQAVXERAQGVVLKvIjTNFK 
S SE IEQAVQSLDRNG VDLLMKY I YKGFEKPTENSSAVLIjQWHE K 
ALAVGGLGS I IRVLTARKTV 


5635 


3 


943


DRGPRSTATOTGRARVSFWRFPLDPGVKNSNVQISGEKRRFRTL 

rslfhpfpvtrsgapravlvgsswpakmvapavkvargwsglal 
gvrravlqlpgltqvrwsryspefkdplidkeyyrkpveeltee 

EKYVRELKKTQLI KAAPAGKTS S VFEDP VI S KFTNMMMI GGNKV 
LARSLM I QTLEAVKRKQ FEKYHAASAEBQAT IERN P YTI FHQAL 
KNCEPMI GLVPI LKGGRF YQVP VPLPDRRRRFIoAMKWMI TECRD 
KKHQRTIjMF£ KJ^niLL»J->r»Ar JnT«U « " v AJxtciv^JyjjrlJ\inMJv*wNri^J^M. 
HYRWW 


5636 


2253 


1143 


LEDTICQHPPAEKKLYLYHRKLREVERNGI PRLPKDVFMDTHQG 
LTDVRAKVTGFSEGWDSVKGGFSS FSQATHSAAGAWS KPRE I 
ASLIRNKFGSADNI PNLKDSIiEEGQVDDAGKALGVI SNFQSSPK 
YGSEEDCSSATSGSVGANST^GGIAVGASSSKTNTLDMQSSGFD 

ali^eiqeir£tqarx»eesfetlkehyqri)yslimqtlqeeryr 
cxrlee:qlndltelhqneilnlkqelasmeekiayqsyerardi 
qeaiieacqtriskmelqqqgx^qwqleglehatarniilgklrni 

LTJVVMAVLLVWSTVANCVVPLMKTRNRTFSTLFLVVF IAFLWK 
HWDALFSYVERFFSS PR 


5637 


948 


2532 


MSFCGARANAKMMAAYNGGTS AAAAGHHHHHHBliLPHLPP PHI^ 
HHHHPQHHLHPGS AAAVH PVQQHTS S AAAAAAAAAAAAAMLNPG 
QQQPYFPS P APGQAPG P AAAAPAQVQAAAAATVKAHHHQHSHH P 
QQQLDIEPDRPIGYGAFGVVWSVTDPRCGKRVALBCKMPNVFQNL 
VSCKRVFRELK>ILCFFKHDNVLSALDILQPPHIDYFEB I YWTE 
LMQSDLHKI IVS PQPLS S DHVKVFLYQ I LRGLKYLHSAG I LHRD 
I KPGNU^VNSNC^TI^CTFGI^ARVEELDESR 
PEILMGSRHYSNAI DI WS VGCI FAELLGRRI LFQAQSPI QQLDL 
1 TDLLGTPSLEAMRTACEGAKAH I LRG PHKQPS LP VLYTLS S Q A 








1 THEAVHLLCRML VFD P YTCR T 5% A FCD AT vAH P YT npRPT nvuTPMrif l 

J CCFSTSTGRVYTSDFEPVTNPKPDDTPEKNIjSSVRQVKE I IHQF 
II^QKGNRVPLCINPQSAAFKSFISSTVAQPSEMPPSPIiVWE | 


5638 


125 



1155


DRKMSELDQLRQEAEQLXNQIRDARKACADATLSQITNN I DPVG 
RIO^IRTRRTLRGHLAKIYAMHWGTDSRLLVSASQDGKLI IWDSY j 
TTNKVHAI PLRSSWVMTCAYAPSGNYVACGGLDin: CS IYNLKTR 
1 EGNVRVSR^LAGHTGYLSCOIPLDDWQIVTS SGDTTCALWD IET 
GQXJTTTFTGHTGDVMS LSLAPDTRLFVSGACDASAKLWDVREGM 

KTYSHDNI I CGITSVS FSKSGRLLLAG YDDFNCNVWDAIiKADRA 
GVLAGHDNRVS CIXSVTDDGMAVATGS WDSFLKI WN


5639 


125


1155 


DRKMSELDQLRQEAEQLKNQIRDARKACADAT^ I 
RI QMRTRRTLRGHLAKI YAMHWGTDSRLLVSASQDGKLI IWDSY 
TTNKVHAI PLRS S WVMTCAYAPSGN Y VACGG LDN I CS IYNLKTR I 
EGNVRVSRELAGHTGYLS C CRFLDDNQI VTS SGDTTCALWD I ET j 
«Wi ^A"t^l\iH\l\iJJVMSI>SIxAPDTRL I 

CROTPTCHESDINAICFFPNGNAFATGSDDATCTUjFDLRADQEIj 
MTYSHDNI IC^ ITSVSFSKSGRLLIiAGYDDFNCNVWDALKAURA [ 
I o V JjA^HIJl>rRV5CIjGVTD | 


5640 

i 


280 


1092 


QQGNKKTMLSHNTMMKQRKQQATAI MK2VHGNDVDGMDLGKKVS 
IPROIMI.EEl^Sm^NRGARI.FKMRQRRSDKYTFENFQYQSRAQI I 
HHSIAMQNGKVIXSSNLEGGSCXSAPLTPPNTPDPRSPPNPDNIAP 
GYSGPLKEIPPEKFNTTAVPKYYQSPWEQAISNDPELLBALYPK 
LFKPEGKAELPDYRSFNRVATPFGGFEKASRMVKFKVPDFELX>L. I 
LTD PRFMS FVNPLSGRRS FNRTPKG W I S ENI P IVITTEPTDDTT j 
VPESEDL


5641


f~ ?7 1 




v-KHWUNC»DvKijljSNQMDKLFArHL J 
EIILSDNSSILVLENNFLFKVKSKQFIHLIAKKFYISITIVSAS 1 
NGESFVLSMIVTG


5642 


199 


1247 


ITP CRMDFLVLFLF YLAS VLMGLVL I CVCSKTHS LKGLARGGAQ 
IPS CI ZPECLQRAMHGLLHYLFliTRNHTFIVLHLVIjQGMVYTEY 1 
TWEVFGVCQELEliSLHYLLLPYLLLGVNLFFFTLTCGTNPGIIT j 
KANBLLFLHVYE FDEVMFP KNVRCS TCD LRKPARSKHCSVCNWC 
VnKcJ;nnLwWvWHLlbAWWJ.Kl rLll VJLj i iiiAoAAi VAX Vi> 111* 1 

LVHLWMSDLYQETYIDDLGHLHVMDTVFLIQYLFLTFPRIVFM [ 
LGFVVVI^FIJ^YLLFVLYIJVATNQTWEWRGDWAWCQRCPL 1 
VAWPPSAEPQVHRNIHSHGLRSNLOEIFLPAFPCHERKKQB f 


5643 



1 J 


847


PSGGVRDVBTRGPGSRAARGPRVVMKFJiGVGAGAIAiCKKIjAEAK I 
YKERGTVTiAEDQLAQMSKQLDMF^ 

RVQFQDMCATIG VD PLASGKG FWSEMLGVGDFYYELGVQI I EVC 1 
LALKHRNGGLITLEELHQQVLKGRGKFAQDVSQDDL I RA1 KKLK 1 

ALGTGFGT TPVnr:TVT,.TnQVPAT^T.MMnHT r V7VT/lT>AV V>jriVV T T r U'Q 1 
* vj x x r vvj\j x x xi^yc v xr -t\ r» i iivi'ii^n a v v l^ju/UiIuiu i v _l V ^> i 

EIKASLKWETERARQVLEHLLKEGLiAWliDLO^^ j 
TDLYS 0E I TAEEAREALP I 


5644 


83


1138


PRRMGSWVQLITSVGVQQNHPGWTVAGQFQSKKRFTEEVIEYFQ
XKVSPVHLKILLTSDEAWKRFVRVAELPREEADALYEALKNLTP
YVAIEDKDMQQKEQQFREWFLKEFPQIRWKIQESIERLRVIANE
IEKVHRGCVIANVV£C;^TnTT.!S\n!GVMTJ^^ TTAACTV I 

GLGIASATAGI ASS IVENTYTRSAELTASRLTATSTDQLEALRD 
I LHDITPNVLS FALD FDEATKM I ANDVHTLRRS KATVGR PL IAW 
RYVPllWVETLRTRGAPTRIVRKVARNLGKATSGVLVV^^ 
VQDSIJDl^GEKSESAEI^QWAQEI^ENLNELTHIHOSLKM J 


5645 


537 


799 


VQSVRDLKKLSPTDPPGDSGI^DVTREDPVTGPLNSASSQVPTL \ 
YLCIjQNSI^HSSVFJJARATMELYQISQRIRARRGLPRLAVSD 


5646 


3745 


3328 


AEQYGTSPHLLPT^LSSCLPPANVTTKAATPPPLVLSLTTADP j 
AG KP AP CRVTLTLLRAS I PATKRASFLSS FI KMFFEELE YILGF 
LSLLKFHVH^SVYSAIOTFOKFXSTGNSRSFTCTPELFPRI^THL j 
RAEGGAQ 


5647 


288 


800 


GVI MATS ELS CEIVSEENCERREAFWAEWKDLTLSTRPEEGCSLH I 
EEOTORHETYHQQGQCQVLVQRS P WLMMRMG ILGRGLQE YQLP Y | 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
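The legend above can be applied mechanically when reading a table entry. The following is a minimal sketch, assuming each amino acid segment is handled as a plain text string; the names `LEGEND`, `MARKERS`, and `summarize_segment` are illustrative assumptions and do not come from the specification. It counts residues, counts the three special symbols, and reports any character that the legend does not define.

```python
# One-letter residue codes, copied from the table legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Special symbols, also copied from the legend.
MARKERS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(segment: str) -> dict:
    """Count residues and legend markers in one amino acid segment.

    Whitespace is ignored. Characters that the legend does not define
    are collected, in order, under "unrecognized".
    """
    cleaned = "".join(segment.split())  # drop spaces and line breaks
    summary = {
        "residues": 0,
        "markers": {name: 0 for name in MARKERS.values()},
        "unrecognized": [],
    }
    for ch in cleaned:
        if ch in LEGEND:
            summary["residues"] += 1
        elif ch in MARKERS:
            summary["markers"][MARKERS[ch]] += 1
        else:
            summary["unrecognized"].append(ch)
    return summary


if __name__ == "__main__":
    # Hypothetical input: a short fragment with one stop codon.
    print(summarize_segment("KSDNVAPRRP * LP PQ"))
```

A non-empty "unrecognized" list indicates characters outside the legend, which in a scanned copy of the table would typically point to transcription damage rather than to sequence content.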








Q R VLPL P I FT P AKMGATKEE REDT P I Q LQ EL LALET ALGGQCVD 
RQEVAE I T KQLP P WPVS KPG AIxRRSLS RSMSQEAQRG 


5648 


7 


1518 


VLS3IX:GRHEAI*REVGAEWPPPTCSPKICSGLQQAGNTDWSLTM 
APQS LPS S RMAPLGMLLGLLMAACFTF CLS HQNLKE FALTN PE K 
SS T XETE RKETKAEEE LDAEVLE V FHPTHE WQALQ PG QAVPAGS 
HVRLNLQTGEREAKLQ YEDKFRNNLKG KRLD INTNTYTSQDLKS 
AI^KFKEGAEMESSKEDKARQAEVKRLFRPIEELKKDFDELNVV 
IETDMQIMVRLINKFNSSSSSLEEKIAALFTJLEYYVHQMDNAQD 
LLS FGGLQ WINGLN STEP LVKE YA^FVLG AAFSSN P KVQVEAl 
BGGALQ KLLV I LATEQPLTAKKKVLFALCSI*LRHFPYAQRQFLK 
LGGLQVTjRTLVQEKGTEVLAVRVVTLLYDLVTEKWFAEEEAELT 
QEMSPSKLQQYRQVHLLPGLWEQGWreiTVTUXALPEHDAREXV 
LQTLG VLLTTCRDRYRQD PQLGRT1*ASLQAEYQVIASI»ELQDGE 
DEGYFQELLGSVNSLLKELR 


5649 


1172 


3006 


KLQEQLDAINEEIRM I QEEKESTELRAEE I ETRVTSGSMEALNL 
KQLRKRGSIPTSLTDLSLASAS PPLSGRSTPKLTSRSAAQDLDR 
MG VMTL PSDLRKHRR KLLS PVSREENRED KAT I KCETSP PS 55 PR 

tlrleklghpalsqeegksaledqgsnpss snssqdslhkgakr 
kgi kss igrxi fgkkekgrl i qlsrdgatgh vlltds e fsmqepm 
vpaklgtqaekdrri,kkkhqlledarrki^pfaqwdgptvvswl 
elwvgm pawyvaacranv ksgai msal3 dte i qre i g i snalhr 
ijo^riaiqemvsltspsapptsrtssgnvwvtheemetletstk 
tdseegswaqtiaygdmnhewignewlpsix3lpqyrsyfmeclv 
darkldhltkxdlrvklkmvds fhrtslqyg i m clkrlnydrke 
lexrreesqhe i kd vlvwtndqvvhv7vq s i g lr d yagnlhes g v 
hsallalden fdhntlal i lq i ptqntqarqvmere fnnllalg 
tdrklddgddkvfrrapswrkrfrprehhgrggmlsasaetlpa 
gfrvstlgttjQPPPappkkimpeahshylyghmlsafrd 


5650 


1172 


3006 


l^EQLDAINEEIRMIQEEKESTELRAEEIEOTVTSGSMEALNL 
kqlrkrgs I PTSLTDLSLASASPPLSGRSTPKLTSRSAAQDLDR 

mgvwtxpsdlrkhrrkt .tkspvsreenredkati kcetsppsspr 
tlrleklgh pals qeegksaledqgsnpss snssqdslhkgakr 
kgikssigrlfgkkekgrliqlsrdgatghvlltdsefsmqepm 
v?aklgtqaekdrrlkkxhqlledarrkgmpfaqwdgptwswl 
elva^pawyv7aacranvksgaimsalsdteiqreigisnalhr 
lklrlaiqemvslts psapptsrtssgnvwvtheemetletstk 

TDSEEGS WAQTLAYGDMNHE W IGNEWLP S lglpqyrs yfmeclv 
DARMLDHLTKKDLRVHLKMVDS FHRTSLQYGIMCLKRLNYDRKE 
LEKRREESQHEIKDVLVWTNIXJVVHWVQSIGLRDYAGNLHESG^ 
HGALIiALDENFDHNTl^LILQI PTQNTQARQVMERE FNNLLALG 
TDRKLDDGDDKVFRRAPSWRKRFRPREHHGRGGMLSASAETLPA 
GFRV3TIX3TIjQPPPAPPKKIMPEAHSHYI*YGHMLSAFRD 


5651 


646 


1869 


ARQGQRQ PWG * EARAKGPASES PRV * EGSGWEGP AS P * TPGSTL 
AWGEGAG I R*ASGLTAAGAASAAAA/ PPPTRGG PAPAGCGRAPP 
WPAPLRVPTHGRAPAPRSRAAPRAPALSHGTAAAALSPASPAGP i 
ADP*LPGHSSQS PPRG * RWGRS RS APAPAH PEHPAPAGS ASASQ 
OTPGWPGSCCLAQGWQAEPI/SAPGAEDG\PVPPQRGFPLGTLGS 
PAGSWAGLAG YG * AGAPGTQATAPRAAGQT PVAAAPNCRV* GSA 
PAIjHRAPAAADPGSPLQAPPRAWASPAAAGPGLSSSDYCGGLGA 

CMPSPPVEGSI/3LSRKGHGDLPSQAR*GWHECRRARHLVPLPRL 
LGPRGRTGRPSSPS 


5652 


735 


343 


HHKKYQHIHQKSFSCPEPACGKSFNFKKHLKEHMKLHSDTRDYI
CEPCARSFRTSSNLVIHRRIHTGEKPLQCEICGFTCRQKASLNW
HQRKHAETVAAIJIFPC^FCGKRFEKPDSVAAHRSKSHPAIjLLA 


5653 


66 


1401 


RGRLQSRGRLTLGLVLLLLDILGARQHGQRVSHGWKGGFLTAPL 
CTPQPCQPGTRRGRRR^LKEATEPQLAMAEEFVTLKDVGMDFTL 
GDVTEQIXSLEGGDTFVnyrALDNCQDLFLLDPPRPNLTSHPDGSED 
LEPLAGGS PEATS PDVTETKNS PLMEDFFEEGFSQEI / SRDVTQ 
GMLLELQFRRSLYRGHLVR* FARRSRKS SEV * YCHQRGKSHGMQ 








1 **z> " i iSJiK X\ji> U v tlK Jt'HGKKFHG \ DN V S IS KTLTP AKS KE YRGEF P 1 
SYSDHSQQDS VQEGEKP YQCSECGKS FSGS YRLTQHW I THTREK 

j PTVHQECEQG FT5RKASHSGYPKTHTGYKFYVCNE YGTP FS QSTY 
LWHQKTHAGEKPCKSQDSDHPPSHDTQSGBHQKTHTDSKSYKCN 
EGGKAFTRI FKLTRHQKIHTRKRYECSKCQATFNLRKHLIQHQK 
THAANV • j 


5654


3 


598


j TLPLFPGRRFRGWRRCGAVAAR KNSTGGNVS I NQRRDS VRMSAL j 
NWKPFVYGGItASITAEOTTFPIDLTKTRFQIQGQTNIlAKFKEII j 

1 YRGMLHALVRIGREEGLKALYSG * VGLHAFLCHCSLFHMGIDFR I 
PRLHRSQVKSLRCV* KEQIA* * /MFSL.LISTLISKYI YYAADVL 

j EKLFYY IQVQTDNNKK I CLFKN I | 


5655 


2 




RPPGIRAPRQLHPAAGRRPDASARPRFRPTVLLHDPFQLSFPPP 
PLSYPSVFPAVARVLPQRSGDYRAAGMPQLSGGGGGGGGDPELC j 
ATDEMI PFKDEGDPQ\REKI FAEI VNPEEEGDLADI KSSLVNES 
EI I PASNGHEVARQ AQTS Q E P YHD KAREH P DDG KHPDGG 1. YNKG 

j PSYSSYSGYIMMPNMNNDPYMSNGSIiS PPI PRTSNKVPWQPSH 
AVHPLTPLI TYS DEHFS PGSHPSH1 PS DVNS KQGMSRHP PAPDI 

[ PTFYPLSPGGGGQITPPLGWQGQP j 


5656 


228 


1066


j PRR VP PIiPE FAS G PGAAF FHSGRLQRS LTKDS AGCFSQCRS RAM 
IiVLRSGLTKALASRTIiAPQVCSSFATGPRQYDGTFYEFRTYYTiK 
PSNMNAFMENI*B1KNIHLRTS YSELVGFWSVEFGGRTNKVFH I W K 
YDNPPHRAEVRKALiANCKEWQEQSIIPNLARIDKQETEITYLIP 
WSKLQKPPKEGV YE LAVFQMKPGG PALWGDAFE RAINAHVNLG Y | 
TKWGVFmEYGELMRUHVLWWNESADSRAAVRHKSHEDPISWG 
GVRESVNYIAVSQQNM j 


5657 


105 


1052 


GQRLQS PR VQMP VQ PPS KDTE EMEAEGDS AABMNGE EEES EEER 
SGSQTESEEESSEMDDEDYERRRSECVSEMI*DI*EKQFSEIiKEKXi 
FMRl^QLRLRLEEVGAERAPEYTEPLGGLQRSLKIRIQVAGIY 
KGFCIiDVIRNKYECELQGAKQHLESEKLIiLYDTLQGELQERIQR j 
LEEDRQS ZjDLS SEWWDDKIiHARGSS RS WDSLPPSKRKKAPLVS G 
P Y I VYMLQEID I IiEDWTAI KKARAAVS PQKRKS D \DLDPAVHS Q 
GDPQSSWHCTQDSRLPPADRRTHRPLRVCPARLLWCCWALPLHL 
ALVWTPPL 


5658


2346


3541


TERRVYN PWPE PDPD \C I QEDP WNLPNS I KTLVDN I QRYVEDG K j 
NQLLLALX.KCTDTELQLRRDAI FCQALVAAVCTFSEQLLAALGY 
RYN^GEYEESSRDASRKWLEQVAATGVLLHCQSLLSPATVKEE 
RTMLEDI WVXLSELDNVTFS FKQLDENYVANTNVFYHIEGSRQA 
LKVIFYI^SYHFSK1*PSR1^GGASLRI.HTALFTKVLENVEGLPS 
PGS QAAEDLQQD INAQSLE KVQQ YYRKX, RAFYL E RSNLPTDAS T | 
TAVKI IXJL IRP INAItDELCRLMKS FVHP KPGAAGS VGAGLI PIS 1 
SEI/ZYRI^ACQMVMCGTCMQRSTTiSVSLEQAAILARSHGIiliPKC 
IMQATDI MRKQG PRVE IIAKNLRVKDQMPQGAPRLYRIjCQPKMN 
GDI> J 


5659


i 1 
! 


£ Q £Z 1 
I 


WKRSGEVS P KGELGAWRGNSGRPKI IGRAAEAENEDRTLGRJbI*P | 
GNERSQPRSPLRLI*APQLKAEAAADKGLAPVPPPFSSGHSGPC\ 
EREGEGQRGRGRSRRGAHLELKPSPGLRAGAPTDRGRGGPABVA 
AAGGRRMVQKESQATLEERESELSSNPAASAGASLEPPAAPAPG 
KDNPAGAGG \ AA vAGAAGGARRr LCGWEG FYGRP WVMEQRKEIj j 
FRRIjQKWELMTYI» j 


5660


229



853 



PVTMWnp^FTiPMPT.T.TMr Tv<iTT/^irifn'r , v*rT.TDavprrupT2ia'DT 1 
CGQDLNKTSRQQIPESQGVI sgavfli ilfcfipfpflncfvke 1 

QRKAFPHHE FVAL I GALLA I CCM I FLG FADDVLNLRWRHKLLLP \ 
TAASLPLIiMVYFTNFGNTTIVVPKPFRPI S YHCC j 
PYGTYFRE P FLVLHI LLQVFLFCliCVF PD P FW j 


5661 


2


473 


IjNLYPSPCGGIPKLPGiiPREAAAALGAS FLAEAPLPVTVRGSGI* I 
AGMAVTCD P KAFI*S I CFVTLVFLQDPLAS I CQN* GTDS CAS RG K 
ADFDVTGPHAPIIiAMAGGHVELQCQLFPNI SAEDMELRl^YRCQP j 
SLAVHMHERGMDMDGEQKWQYRGRT 


5662 


2


1318 


I*RKEGRCRRGSOT?GVWAAPAEGLGGRGMLGVRC1.LRSVRFCSSA 
PFPKHKPSAKLSVRDA1X3AQNASGERIK1QGWIRSVRSQKEVL.F j 



WO 01/53312
PCT/US00/34263











LHVNDGSSLESLQVVADSGIjDSRELTFGSSVEVQGQLI ks pskr 
QNVELXAE KI KVIGNCDAKDFP I KYKERHPLE YLRQ YPHFRCRT 
NVLGS ICR IRS EATAAIHS FFKDSG FVHIHTPI I TSNDS EG AGE 
LFQl^PSGKLKVPEENFFNVPAFLTVSGQimEVMSGAFTQVFT 
FGPTFRAENSQSRRHIiAEFYMIEAEISFVDSLQDLMQVIEEIiFK 
ATTMrf VLS KCPEDVELCHKF I APGQKDRIi * HMLKNNFI* I ISYTE 
AVEILKQASQNFTFTPEWGADLRTEHEKYLVKKCGNIPVFVINy 
PLTLKP FYMRDNEDGPQELEGSVA* HSLGLM ILLS IWIGQP 


5663 


119 


698 


PADIGRSTAKTPGPPRSLEMDDPRYGMCPLKGASGCPGAERSLL | 
VQSYFEKGPLTFRDVAIEPSLEEWQCLDSAQQGLYRKVMLENYR 
NLVFLG IALTKPDLITCLEQGKEPWNT KRHEMVAKPPVI CSIIFP 
QDLWAEQDIKDSFQEAILKKYGKYGHANFQLQKGCKSVDECKVH 
KEHDNXLNQCLI PKKKK 


5664 


118 


572 


SLSMESNHKSGDGLSGTQKEAAIiRALVQRTGYS LVQENGQRKYG 
GPPPGWDAAPPERGCEI FIGKLPRDLFEDELI PLCEKIGKI YEM 
RMMMD FNGNNRG YAFVT F SN KVEAKNAI KQLNNYE I RNGRL LGV 
CASVDNCRLFVGGIPKTKK 


5665 


347 


702 


WQHLI ILLHCERTSPAMITSEXPVLQDSTNETTAHSDAGSELE 
ETEVKGXRKRGRPGRPPSTNkKPRKSPGEKSRIEAGIRGAGRGR 
ANGHP QQNGEGE P VTLFE WKI/3KSAMQRC 


5666 


213 


540 


VSCLPTSCKMITLNNQDQPVPFNSSHPDEYKIAALVFYSCIFII 
GLFVN ITAIiWVFSCTTKKRTTVTI YMMNVALVDLI FIMTLPFRM 
FYYAKDEW P FG EYFCQI LGA 


5667 




695 


HPLPS ASLGLPS VS LGVSLCVRS ALLEA WPHL P KRRRARVGS P 
SGDAAS STPPSTRFPGVAI YLVEPRMGRSRRAFLTGLARS KG FR 
VLDACSSE ATHWMEETSAE BAVS WQERRMAAAP PGCTP PALLD 
ISWLTES LGAGQP VPVECRHRI*EVAGP S KGPLS P AWMPAYACQR 
PTPLTHHN1X3LS EALEI LAEAAG FEGS EGRLLTFCRAAS VLKAL 
PSPVTTLSQLQ 


5668 


691 


894 


CSFLFCI PDLFLQFLLGRKEEEAVLVGGEWSPSLDGLDPQADPQ 
VLVRTAI RCAQAQTGIDLSGCTKW 


5669 


407 


1 


DSGAPEGLSPLMSTQEGLSMHAHPQAYTPFIYLHARKRRGEIGD 
ADSRFNDRYAHKSAQLYFLYFVCWIFQDVYYFTIKEKNHFFFPK 
ARGAPTKYSGS PIGSPTTTPPTRP PS FNIiHPAPBa.IiASMQLQKI* 
NSQ 


5670 


3 


373 


SSECLTMAWIPLLLPLLILCTVSVASYELAQPSSVSVSPGQTAK 
ITCSGDVLAKKYARWFQQKPGQAPVLVIYKDTERPSGI PERFSG 
S TSGTTVTLTI SGAQVEDEAD Y FCYS ATDNFLWVF 


5671 


280 


524 


KFPPKKTP PHIiGMES AITLV7Q FLlLQLLLDQKHEHL I CWTSNDGE 
FKLLKAKKVAKLWGLRKNKTNMNYDKLSR 


5672 


2 


557 


FVPATPDPGVWLPPSRDPAMAKRSSLYI R I VEGKNLPAKDITGS 
SDPYCI VKVDNEP 1 1 RTATVW KTLCPFX'JGEEYQVHLP PTFHAVA 
FYVMDEDALSRDDVIGKVCLTRDTIASHPKGKFSLPSHTGLPSP 
WPPSHSETSPl^SVHSPAC^KPFliLSPEAGATFCTPGLCSAACS 
QAWLLLPLP 


5673 


327 


696 


ITVADQ I SH WSAGR I KNRTRI PECIHSSAATTLAGPHTMEGESV 
KLSSQTLIQAGDDEKNGJ^TITVNPAHMGKAFKVMNELRSKQLLC 
DVMIVAEDVEIEAHRVVLAACS P YFCAMFTGDMS 


5674 


17 


984 


GGGSMEGESTSAVLSGFVLGAIJ^QHLNTDSDTEGFLLGEVKGE 
AKNSITDSQMDDVEVVYTIDIQKYTPCYQLFSFYNSSGEVNEQA 
LKKII^NVKKNWGWYKFRRHSDQIMTFRERLLHKNLQEHFSNQ 
DLVFLLLTPSIITESCSTHRLEHSLYKPQKGLFHRVPLWANLG 
MSEQLGYKTVSGSCMSTGFSRAVQTHS S KFFEEDGS LXEVHKIN 
EMYASLQEEIiKS ICKKVEDSEQAVDKLVKDVNRLKRE IEKRRGA 
Q I QAAREKNIQKDPQEMI FLCQALRTF FPNS EFLHS CVMSLKID 
MFLKVAVTTTTISM 


5675 


80 


753 


EGSRRG PTRLARLS ARAGRLHFP PGFSSRLIHFRGVSECRRP PG 
KSGVPVSAPGSIW3KWWEERPGMFSI/4ASCX^WKRWREPVRKVT 
LI*WVGLD1^GKTATA1CGIQGE YPEDVAPTVG FSKINLRQGKFEV 
TI FDLGGGIRI RGI WKNYYAESYGVIFVVDSSDEERMEETKEAM 



SEMLRHPRI SGKP ILVLANKQDKEGALGEADVIECLSUSKLVNE 
HKCli 



5676 



930 



FVSSPPPRPVQPARPGGFGLSGRRSUiCQVASTPAHVGVMRSPV 
RDLARNIXjEESTDRTPLIjPGAPRAEAAPVCCSARYmiAILAFFG 
PFIVYALRVNIiSVALVDMVDSNTTIiEDNRTSKACPEHSAPIKVH 
HNQTG KKYQ WDAETQGW I LGSFFYGY I ITQIPGGYVASKIGGKM 
LLGFGILGTAVLTLFTP IAADLGVGPLIVLRALEGLGEGVTFPA 
MHAMWSSWAPPLERSKI>I>SI5YAGAQLGTVISLPLSGIICYYMN 
VfT YVFYFFGTI G I F WFLLWI WLVSDTPQKHKR I SHYEKE YI LS S 
L 



5677 



1028 



PPRDGFLELRRI-SVPIjCSGPCPIjTSLSROGERSGGHIjVAAARAA 
VTAETHPLPLLAPLAVCQSVKS PAACQVRPRPRAVALPAALGG P 
GRS1»PGLTAATMSS FSESALEKKLSELSNSQQSVQTLS LWLIHH 
RKHAGPIVSVWHREI*RKAKSNRKLTFLYLANDVIQNSKRKGPEF 
TRE FES VLVDAFSHVAREADEG CKKPLERLLN I WQERS VYGGEF 
I QQIj KLSMEDS KS F PP KATEE KXS I*KRTFQQ I QEEEDD D YPGS Y 
S PQDPS AGPIjLTEEIj I KALQDIiENAASGDATVRQKI AS LPQEVQ 
DVSI^EKITDKSAAERLSKTVDEACLRNRGPGTS 



5678 



593 



SSSPPSSTPSLPLPFYX.LL.GQLRIiQLLWGTAHLSGAGEAAPCPG 
GSGRTAAPRTRADPAAQSlUMIMNKMiCNFKRRFSIiSVPRTETIEE 
SLAEFTEQFNQLHNRRITENIjQLGPLGRDPPQECSTFSPTDSGEE 
PGQLS PGVQFQRRQNQPJtFSMEVRASGALPRQVAGCTHKGVHRR 
AAALQPDFD VS KRLS LPMD I 



5679



623 



5680 



5681 



258 



45 



592 



869



39 



622 



LNSRVDDFVAVPGAIMDEDYYGSAAEWGDEADGGQQEDDSGEGE 
DDAEVQQECLHKFSTRDYIMEPS IFNTLKRYFQAGGS PENVIQL 
LSENYTAVAQTVNLIiAE WL I QTGVE PVQVQETVENHLKS L«X» I KH 
FDPRKADSIFTEEGETPAWLEQMIAHTTV^DLFYKLAEAHPDCL 

MLNFTVKVGRVLEIiRRKWMNVYFWLLVCFL 

RRLTS TSEKLQNRNSHTP LESTL IHPQPS Y KG FG I MFGKKKKKI E 
ISGPSNFEHRVHTGFDPQEQKFTGLPQQWHSLLADTANRPKPMV 

DPSCITPIQLAPMKTIVRGNKPC 
LLCAKTLGVRTKESQAEGYNRSGINNHQAEDPRFCPSFCWMRSA 
RQTRPQRLRKEAARPPTPGSCPGGTGMDGKKCSVWMFLPL.VFTL 
FTSAGLWI VYF IAVEDDKI LPLNSAERKPGVKHAPYIS IAGDDP 
PASCVFSOVMNMAAFLALVVAVjLK FI Qt»KPKVLNPWLNI SGLVA 
IiCLAS FGMTLLGNFQIiTNDE E IHNVGTSLT FGFGTLTCW I QAAI> 
TLKVN I KNEGRRVGI PRVILSAS I TLCVGPLLHPHGPKHPHVCS 

QGPVGPGHVL 

PSRS C1K3TMRKWRHREVN1»P EVTQQDAVCPAP I PS PGLSAQTGL 
QKIWGTIHCQVCPGAPAWPGSPWHEEMGLIiLLVPLriLLPGSYGL 
PFYNGFYYS^SAlTOQNIjGNGHGlCDLI^GVKIjVVETPEETIiFTYQ 
GASV I L> P CR YR Y E P ALVS P RR VR VKWW KL»S ENG AP E KDVL VA I G 
liRHRS FGD YQGR VHUIQD 



5683 



89 



778 



5684 



19! 



677 



5685 



5686 



779 



1262 



GSCGATALITRCI^WSVLISRLAMATYTCITCRVAFRDADMQRA 
HYKTD V7HR YN1»RRKVA£MAP VTAEG FQ ER VRAQRA VAEE E S KGS 
ATYCTVCS KKFAS FNAY ENH1»KSRRH VELE KKAVQAVNR KVEMM 
NEKNIiEKGLGVDSVDKIlAMNAAIQQAIKAQPSMSPKKAPPAPAK 
EARNVVAVGTGGRGTHDRDPS E KP PRL»QWFEGQAKK1*AKHS EDD 

SEDEEHDLC _ 

twcfrgylgprV Imxa Lde P PYLTVGTDVSAKYRGAFCEAK I KT 
AKRIiVKVKVTFRHDS STVB VQDDHI KG PLKVGAI VE VKNLDGAY 
Q3AVINKLTDASW YTWFDDGDEKTLRR9 SLCLKGERHF ASSET 
LDQL PLTNPEHFGTPVIGKKTNRGRRYE 

I*LLQQPVVHCFIiLFP P FRFSHHMI PG P PG PHTi'G 1 PH PA ] 
VKQEHPHTDSDLMHVKPQHEQRKEQEPKRPHIKKPl^AFMLYMK 

EMRANWAECTLKESAAINQI LGRRWHALSREEQAKYYELARKE 

RQLHMQLYPGWSARDNYVSPSS I PVALHS 



128 



1181 



"CTWWQVNITDI^INDNHPTWKDAPYYINLVEWTPPDSDVTTVVA 
VDPDIiGENGTIjVYSIOPPNKFYSI^STTGKlRTTHAMIiDRENPD 
PHF.ACT.MRKIWSVTPCGllPPLKATSSATVFVNLLDLNDND PTF 








qnlpfvaevlegi pagvs iyqwai dldeglnglvsyrhpvgmp 
rmdflinsssgwvttteldrbriaeyqlrwasdagtptksst 
stltihvldvndetptffpavynvsvsedvpr\gsgwsg*aarn 

NDVGLNAELSYFITGGim)GKFSVGYRimVVRTWGlJ>RETTAA 
YMLIIiEAI DNG ?VG KRH TGTAT VTVTVLD VNDXRP I ILOSSYV 


5687


17 


917 


AAPPAPPDG/PPP/PPPAPPT/PGPAA/APASSCQPRLSAGRAA 
QGDGGAAAVGHVLWPAVGPVRVNPGIiQTP VPRPELLPG P \ S SS 
IJHSDSSYPPDAGLSDDEEPPDASLPPDPPPLTVP/ADA/PMPVT 
SGCRMPSTSASE/AAGGQGACTHAKGS ETPPPASPQTSE PAPSP 
LPPHLTGGPGMYSSEAKXPNSFSCLGIAGTGAGI * GTASAHGTG 
PPVLPHVCTPSLANPQP\AVGPEASSLPLGVSGIGMSA/SAPIS 

SSPFVAIGSCWLRGI PPPGSGFLCPGRAPG PVP ITTHGQEGQGP 
VLDI 


5688 


1 


420 


LTKWDL FGNCYRLLKTG I EHGAMP EQVG VYWYS / CL» YDSRKL F F 
*SHMI IRSLL* KVTDDSLGQLPLLRELLL* *I*NVTDRCI IXjAYV 
IjRVEKTFAITYLKNFTVKVDFSIJjGEIPIjISMAAIIjKLWIMKID 

dgyipavf 


5689


1504 


3 


HELSG KHISMVSGNTCNWHPGGHS PGGGGQGEITSKDRGB I PAL 
IWA/RK?IGT^fTATKPTHRAG*GGAEEYQPPPQPCEGPRSTSRG 
GEG*GHAVGPGREIGKEGSLPFLGPKALGF*SASCQRAFEGGAH 

gstarkpapatpgtrhprtmetoevaqgwpagprsqfwdqhphs 
pgehrpsg\splpacpprawpkagavasatgtg\pqlpgsrgkq 

KLPRTREPPLLQAGWAVRKP PWSEAKEGLGQAGRPSGMDSSAS \ 
PQTPGGRGSLEWGLPLYXGPHHDVK*RSDRLG* PP * GGQGGGGH 
GAPSTPGPGGEAW*LPQQTSRPKPGPQAY*GE\GSPGLQCPCSK 
EL»*RVPPGSLGPSTQC3 W CYEPTDKHS\GGADAQLEVSTAGSRSTF 
GQELKGPLDAGRLWPGAPSAS SSHR*GG* ERARAGAGHRGST* A 
SSKIEQGRPRPGPTSDAIiADVEGGAES/GPHPWPLPGTLPNR/P 
GS PPPA* AS AGRKGTVSTLGGGLL 


5690 


1424 



58 


P3PFAGVCAAPAPLPL1ALARRDRRPCSPGAEAAPWQTGGPAID ~" 
GAWRTS VSALRRGATG/APCS PGAEAAP WQTGG PAI DG\DGELP 
* VRSEEAPRGCGAEGGGPGSGP VRR PGAGRGAHAGQGRQQDPEP 
DGLRHRQHGAASHARHRiQRLRPGHHQNRHVRRDPQAPPGGPAP 
GHAAAL P ERTRG VAE P P AWAHAGS DAWRAG R * SQRT * ERAR PRH 
PTFQGRAGS \GQPG YQ PPNPH PGPSS P PAAP\GPRGA* GNPQLE 
KAPRSDRNPSQGLRTRIRRPETPDCGPPSPAGSSASASTFRCTS 
SLS LX>G P / PGAHNLDTAPQDR * HGP * GDKRGAPG VAGED PR P P * 
GNFVR * LLLMP/GVA* RHGTS PFLGPSLGENGGQWDSGNLFGTP 
KG * SHPAFTKST* SMEAEKS YWNHPHR\DRGRQGVR INCLRVGE 
SEMWGPYSAPRPGTVFIjSSFLSPASEEH\PEGSSSFNTPFPPAG 
PEGDPGLNSPGLLP 


5691


107 


550 


ISNDPS PGYNIEQMAKRGKKLVEIiPYTVKGMDVSFSG ILS FIBD 
VAHRMLATGECTPEDLCFSLQVMQ * KTGTESWG*RFYI VEQN* S 
GDAPLIFSPYLSLTGNCGFA^VEITERAMAH\C!GSPGGPSLWG 
GVGVYVLLESVPLSYS 


5692


1193 


548 


TQAWTRAEKX)RKGSVRALRIjHLERGPPT*RGSHPL\QSVPCIQK~" 

PS I fss ypi /glpqsggepgpvgeqqpvrrpeqpscgpasrmpl 
tsrsvppgrgalppdslstrkglprpstaghrvresghkvpvsq 
rlnlp vmgatrsnlqpprkvavpgptr * rdqds kqdfss kplqs 

V jt , t»l_iA£> 1 Uv A -1 i*AL»i> Vjrtt* 1 \ityKJJA I KAKiLPLr V&TMGONG VD 


5693 


1258 


1330 


ALTVVPVRKGTTWWAQPHGCSNLVSRARLDLSSRPSQNTEPQAP ~ 
*QAGPPSSLRPP\SRRR*APEWPKRATGSRCRGLSAPPWPWPAA 
RGB/ PGSAPSHAP/PNS PRPSGTRHP/PGPSSRVLYS PSLPRNS 
PEAIVWRSSRFPLWFPLRCCFWVSGFKDPNPVLRFF 


5694 


3 


1338 


GS KE P ARSLHRRGSGHKS S AGKWGSVTLS TAGALG * KQLHQ* WT " 
QRCL\NMLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 
SliAESGL-SWFSF^EEKAPKKLEYDSGSLKMEPGTSKWRRERPES 
CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTS PITHTAQSAL 
KVAGKPTCKATDKGKI>AVKNTGI^RSSSDAGRDRLSDAKKPPSG 
IARPS TSGS FGYKKPPPATGTATVMQTGGSATLSKIQKSSG I PV 








KPVNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS 
VTGGRGGPRP VSSS I DPSLLSTKQGGLTPSRLKEPTKVASGRTT 
PAP VNQTDREKEKAKAJCAVALDSDNI SLKS IGSPESTPKNQASH 
PTATKLAEL P PTPLRATAKS FVKP PS L ANLDKVNSNS LDLPSSS 
DTTQCI 






JLmt J O 


GSKEPARSLHRRGSGHKSSAGKWGS VTLSTAGALG* KOLHO* WT 
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 
SLAESGLS W FS ESBE KAP KKLE YT5SGSLKMEPGTS KWRRBRPE S 
CDDS S KGG ELKKP I S LGHPG S L KKG KTP P VA VTS P ITHTAQSAL 
KVAGKP EG KATD KG KLAVKNTGLQ R S S SD AGRDRLS DAKXP P S G 
IARPSTSGS FGYKKPPPATGTATVMQTGGSATLSKIQKSSGI P V 
KPVNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS 
VTGGRGG PRP VS S S IDPSLLSTKQGGLTPSRLKEPTKVASGRTT 
PAP VNQTDRE KEKAKAKAVALDS DNI SLKS I GS PESTPKNQASH 
PTATKLAELPPTPLRATAKS FVKP PSLANLDKVNSNS LDLPSSS 
DTTQCI 


5696 


3 


^ 1 T O 


QRCL \NNLSS EEFNAS S SLNSL PSTPTASRRNS TI VLRTDS EKR 
SLABSGI^WFSESEEKAPKKI^YDSGSLKMEPGTSKWRRERPES 
CDDS S KGGE LKKP I SLGH PGSLKKGKTP PVAVT S PITHTAQSAL 
KVAGKPEGKATDKGKIAVKNTGLQRSSSDAGRDRIiSDAKKPPSG 
IARPSTSGS FG YKKPP PATGTATVMQTGGS ATLS KIQKS SG I PV 
KP VNGRKTS LDVSNSAEPG FLAPGARSNIQYRS LPRPAKSS SMS 
VTGGRGGPR PVSSS IDPSLLSTKQGGLTPSRLKEPTKVASGRTT 
PAPVNQTDRE KEKAKAKAVALDSDNI S LKS IGS PESTPKNQASH 

DTTQCI 


5697 


1147 


47 


PS EALS PPACP SAP AP RRS 1 1 SRL FGTS P ATEAAP P PPE P VPAA 
QGPATVQS VBDFVPDDRLDRS FLEDTTPARDE KKVGAKAAQQDS 
DSDGEALGGNPMVAGFQDDVDLEDQPRGSPPLPAGPVPSQDITL 
SSEEEAEVAAPTKGPAPAPQQCSEPETKWSS I PAS KPRRGT APT 
OTJWIPPWPGGVSVR'nGPEKRSSTRPPAEMEPGKGEQASSSESDP 
EGPI AAQMLS FVMDDPDFES EGSDTQRRADDFP VRDDPSDVTDE 
DEGPAEPPPPPKLPLPAFRLKNDSDLFGLGLEEAGPKESSEEGK 
BGKTPSKENIO<KKKKGKEEEEKAAKKKSKHKKSKDKEEGKEERR 
RRQQRPPRSRERTAA 


5698 


2 


666 


GAEAAEPQEDLPPLSQSSRFFQEQQKMNKSLGPVSFKDVAVDFT 
QEEKQQLD PEQKI TYRDVMLENYSNLVS VGYHI IKPDVISKLEQ 
GEEPWIVEGEFLLQSYPDEVWQTDDLIERIQEEENKPSRQTVFI 
ETLI*R/ERGNVPGNTFDVETNPVPSRKIAYTHSLCNSCER\GF 
NASSBYISSDGRYARMKADECSGCGKSLI^IKLEKTHPGDQAYE 

FNQ 


5699 


2 


1448 



RVRQPPGLWVRRTVPAMQCPAGLSRVPGVAG/ DPSLPS FRGPRD 
EAAHRGTIOTARHTRKLYVQGPASGPPLPRVSTQVAI *DEKPLA 
RPS/GRTNAPFPQGQKPAGKAAPGPAAAGRVAMR\PGHPGLLAS 
DSQRSSSKGSGWETP VPWS * AQPGWVSGLLLLGDPSGPGSL* RS 
TWLVGGARGPEGSGVRGSGWPSGCSDIGWALAGWNHS *HLDPNT 
WTQKWTGE / SPAPGEEG\VAPAPRGPTAEHGHCELTTESQYSNN 
VPILFQNPSGALRS RRTEPAG WVP PTRHE * DDG* TAAPASGGAP 
VSTPTWAGTP/LNASLGPTDPQGKPGCRPPCALPKPAGPERSA* 
GGSLGCR/SMLPASSGPPPAPGPRRIiAAGAHTSASARCPPAAAA 
GWQ PRRPGFAGRAAL PGP PHP PSS *RELGGLPGPGW * TLDPLPA 
HPAHPPGSAP PWGALGG WAAARASLPWS PSLCLSFPAVTPVAGL 
FPPGRG 


5700 


923 


597 


NGHKGVWE INI Y * RRSN IHKNS KS ESHLNQDHS FP P PTPNS ARS 
KliHSTGTAKNTGLPLSG APRQRAVFSGRT I CQE FSS CLQ CAYLD 
E* CS IASSLI KAILRVS VLSE 


5701 


59 


410 


" i" FE KICSDTQE F I S PE INPQ I CS WL I FDKGAK/ NHATGKDSLFN 
KW S W KNWLSTCR* MRPGP YFTP YTKINS K * I K / DANIRCETVKL 
LEENTGENLHDTGLGNVFLDMTPKTQPTKQK 


5702 


3


1517 



ETFVDPSQCGGIPSDSPHPVITPSRASESSASSDGPHPVITPSR 
ASESSASSDGPHPVITPSRASESSASSDGLHPVITPSRASESSA 
S SDGPHPVi TPSRASESSASSDG PHP VI TPSRASESS ASSDGLH 
PVITPSRASESSASSDGPHPVITPSWSPGSDVTLIAEALVTVTN 
I EVINCSI TE I ETTTSS I PGASDTDLI PTEGVKASSTSDPPALP 
DSTEAKPHITEVTASAETI^TAGTTESAAPHATVGTPIjPTNSAT 
ERE VTAPG ATTLSGALVTVSRNPLEETSALS VETPS YVKVSGAA 
FVSIEAGSAVGKTTSFAGSSASSYSPSEAALKNFTPSETLTMDI 
TTKGPFPTSRDPIjPSVPPTTTNSSRGTNSTIAKITTSAKTTMKP 

ptatpttartrptt\a*vqvkmevssscxs*vwlprktsltpewq 
kg * cssstgns tptrltsrsp y cvsg eang / psaaarhvp yakr 

GCCP* PGPPP^DCSCVTVLRGTQKVPMKGSMSKPLTPDVATCPS 
LTSTGVYWGGASPVPRGVLGLTIJVHVLCFS KEKT 


5703 


14 


1117 


HHKDSRSQGI*PRTQECARPELRPIjIjCPRALWPVTRLSYRCPWQA 
PKAG IGTKAKPSESHLKLHPGWPSLDRQGEPATLGTGTGHCSDS 
RILRWHP*HTAAR* PRWRRI*PSSHRWTRHLGVLRVQDKS * * VSL 
DPSCRPRFLRTC* * YGMRSVASS SNP PPGWSGPG ASVFPARPVS 
ALPTGP RC W * APRGRTRQPCGWPRLS S PHATADWGPGCP LS PSR 
GSWETAPGS * WCPWL* AARWTGWRTASGASAGLGRAADRPS AWA 
RRVAGLIiPGQGLTVRR*H* TAGAPAS VRS SQGATRS PAPGGDQ C 
A0GRGPGS C* HPPPWPVSPSSPVPCPSGR *HLRGPLLSAARPRA 
AGWPRHSPHDTQTPEP 


5704 


23 


562 


GDYEFDSPYWDDISQAAKDLVTRLMEVEQDQRITAEEAISHEWI 
SGNAASDKNIKDGVCAQIEKNFARAKWKKAVRVTTLMKRLRAPE
QSSTAAAQSASATDTATPGAAGGATAAAASGATSAPEGDAARAA
KSDNVAPRRP*LPPQPQMEVPPQPLMAVSPQPPMEASLQPLMGE
SPQP 


5705 


23 


562 


GDYEFDSPYWDDISQAAKDLVTRLMEVEQDQRITAEEAISHEWI
SGNAASDKNIKDGVCAQIEKNFARAKWKKAVRVTTLMKRLRAPE
QSSTAAAQSASATDTATPGAAGGATAAAASGATSAPEGDAARAA
KSDNVAPRRP*LPPQPQMEVPPQPLMAVSPQPPMEASLQPLMGE
SPQP 


5706 


1161 


510 


QLGRFXAQDT VA I RKVKEV FGTG AMRH W I LFTHKED* GGQALD 
DYVANTDNCSLKDLVRECERRYC^FNNWGSVEEQRQQQAELLAV 
IBRLGREREGSFHSNDLFLDAQLLQRTGAGACQEDYRQYQAKVE 
WQVEKtlKQELRENE^NWAYKALIJlVKHLMLLHYE I FVFLLLCS I 
LFFIIFLF 


5707 


28 


609 



GSPAPTPGFRRRPGRGTPSPGTRHHQGRAEPEPDAPERAPLRR* 
MEAI QPGLiAEGGQFLGDP PPG LCQ PELQ PDSKSN FMASAKDANE 
NWHGMPGRVEPI LRRSSSES PSDNQAFQAPGS PEEGVRS PPEGA 
EI PGAEPEKMGG AGTVCS PLEDNGYASSSLS I DSRS SSPKPACG 
TPRGPGPPBPLLPSVAQA 


5708 


44 


1925 


SFSWEETISPCFPIOMPAEPWWLSPVSLGAAGWPGQPRPYLDLPA 
QASVSRPHDRA*GEAVSLSI*SSGDVCGHTDGGGAGSDPQAKPKP 
PRCPFTAMP5PRTKQKVRNKVCLLIAIRYSDIPSDVSKAP\GPA 
GNPHDRSSTAA* LHRRAGAGSLCLSASLLPPSFSLGAPGAPSPL 
RVSPASGGPRKEGRQGSGG * AGGGGP \ARTHADL PCVGFVCS PP 
LLK* SDS PVKQLPA\SGQGSGAGMPPVGSSDILRPRPTSVSGTG 
RAAG*CSWQPAACCTPRSQ*WAVARSPSRCSRW*RQSGR*RG*S 

LEPSGPTSGSAL* TWASHSTGA* *SRLCGTAGTGPLCSQSSRS * 
AG* RCCCTAASPCGGSGPSHPGS PSAHCLSWSGGRTQPRAPSAH 
G^RAMGSRCVCTCTGLPCPGIPLSGASPGGSGETGAGRSHTLK 
AARSRLS PRPGSGSRGS Y* SHNDNWGTWPAPPSAGKLLVGG *NS 
QRTSSDH * YTGTRRPWAGPGTRCSTAPSRAAPPVSRCRPPPPPP 
PPRPPRLPAAAS/SGGASGSPAASCSCSCRAPAKPASS/GEAPA 

PPPRPEPPPPPARRP 


5709 


2 


2C31 


ITLCPLPQTEKCIiNVWEAATPIjGIYLKARVEAGGLKEIjEISWG 
IiHQ rVTVRWGAVVMRAGMGGCRC/JGVMAP FAPR/NAI»S FLVNDCS 
LI HNNVCMAAVF VDRAGE WKLGGLD YM YS AQGNGGG PPRKGI PE 








LEO YDP PELADSSGRVVREKRSADW"JRLGCL I W EVFNGPLPRAA 
AL^RNPGKIPKTriaVPHYCELVGANPKVRPNPARFLQNCRAPGGFM 
SNRFVETNLFLEEIQI KEPAEKQKFFQELS KSLDAFPEDFCRHK 
VLPQLLTAFE FGNAGAWLTPLFKVGKFLSAEEYQQKI I PVWK 
MFSSTDRAMRIRLL^QMEQFIQYLDEPTVNTQIFPHVVHGFIJyr 
NPAIR£QTVKSMLLLAPKLNEANIiN^ I 
RCNTTVCXjGKIGS YLSAS TRHRVLTS AFSRATRD PFAPS RVAGV 
LGFAATHNL YS MNDCAQ KI LPVLCGLTVD PEKS VRDQAF KAI RS . 
FLSKLESVSEDPTQLEEVEKDVHAASS PGMGGAAAS WAG WAVTG 
VS S LTS KL I RSHPTTAPTETNI PQRPTPEGVPAPAPTPVPATPT 
TSGHWETQEEDKDTAEDS STADRWDDEDWGSLEQEAESVLAQQD 
DWSTGGQVSRASQVS\TPTTNPPNPQSPTGAAGK\RGLLGTGLA 
GAKLPGATS * RYTAGQRV 


5710 


1 


562 



IPGST I SCEVELMARMAKT IDSFTQNQTRLWI I DGLDACEQDK 
VLQMLDTVRVLFSXGPFIAI FAS D PHI I IKAINQNLKSVPSGFK 
\LNGHD YMRN I VHLPVFLNSRGL /RQ / LQENFS * LQQQMETFHA 
Q I LQG YR KKLT E E FHRTALGR * QNLVARQPS I DG * DAI G FEL YV 
CIAIQFNTNKDDAT 


5711 


1526 


1130 


RRHPFQV3TTVTQEAFSHHDVAFTSTPVLFYPDSAQPFIVKSESS 

SQIAKAVLSQQRPSLFHECAFHFFS * SLQRHTINLDQGIF+LLM 
LSEERQHLFESS/IWTTPHNLK*/FEIHEH1U^SHEGHWTLFFLL 

OIL 


5712


■a 


1391 


GRKLFQSLDISERLKFLLTLDCVDDTLI VLAEEHGCLDI I KEi,P 
ETVI DI tT iNKCLTFHPSKRPTPDELMKDKVFSE VS PLYTPFTKPA 
SLFSSSLRCADLTLPEDI SQLCKDINNDYLAERS I EE VYYL WCL 
AGGDLEKELVNKEI IRSKPPI CTLPNFLFEDGES FGQGRDRSS / 
TFR* YHWD IWMPAKK+ IERCWGRS ILP X TLKMTSLILPYSNSN 
NELSAAATLPLI IREKDTEYQLNRI ILFDRLLKAYPYKKNQI WK 
EARVDI P P LMRGLTWAALLGVEGAI HAK YDAIDKDT P I PTDRQ I 
EVDI PRCHQYDELLSSPEGHAKFRRVLKAWWSHPDLVYWQGLD 
SLCAPFLYLNFNNEALVYACMSAFI PKYLYNFFLKDNSHVI QEY 
LTVFSQMIAFHDPEI^NKXNEIGFIPDLYAIPWFLTMFTHVFPL 

HKI FHLW \DTLLLGEFLFP ILYWE 

PVCAVPVDRWPVLPREDQEGQQL* AKLPRDFRR* FQILGPMEGH 
T ACRCS RRG AQVQHL PRED I RAAE * D PHLREVW PGL PTSS ATS P 
* RAVLTS PCSHLG5 ADAASSHWLCGVS FH 


5713 


534 


284 1 


5714 


212 


613 


WGLGLGPTMSSLGGGSQDAGGSSSSSTNGSGGSGSSGPKAGAAD 

KS AWAAAAPAS VADDTP P PERRNKS G 1 1 S E PLNKSLRRSRPLS 
HYSSFGSSGGSGGGSMMGGESA0KATAAAAAASLLANGHDLAAA 

MA 


5715 


131 


1979 


ESASQQKRSKCLILTLKLELSGSAPKKTSARPGSSLWLPPHSQE | 
QTPPASKLQGGGGGLQTGWGLHPVPVTAAS PLPRWCLFGAVAK\ 
GLPGP *LCPSGAA/GGLQRGPGLS PLGAAGKVSCLHPPSMVENN 
DS TCHEHHEGI LAARVTPVP \ SG KPGR VLKP PGRVCR PPHPAAS 
PRPPGS/SDLDGPRPQMHLRAFPAAHGGPVOTPHGGEEKTFMSS 
QIRRKETKPL*RKTPAG\NNYQSNSIPVSQSPQLTVI>LLPSAGR 
TQAPSGRGDAGKPTPGHG\LPKASVILTPNCPCSLAGGQ*PPGL 
YPKTPKQRRWRRPL/LLGPSQ*GSRQSTC*EV\GALGEPVRIPG 
L* PDLS C I LS NGSKHRREGLS FPRSLG PGRRG PAGLQSLGCS PT 
PKNTACHS S GHVALQAG HDS ARD VGSGHVALQAGHD S TQDVGR P 
VWRW1PLE * LGLSRETGQATRRGLVWIS PGRAAAACVACAQALE 
EG PLRLPGQDRGAQ PCSHCPGRAAGQP EPGAG APCRE / GG * DPT 
GLT/GWGTDPKRGGRKPGQSGQETQGPTWSGPESPLQPKP*E 
RQE /VGAGASSGVGLSRGRAGGPSSAWEVAAMLLLLRHGSHSEL 

TDLTEAQTSQH 


5716 


1711 


1370 


RVFSLLCEGPGHCYQGAVCREACAAASPGLDSAAEPHRLCEHTD 
*LPK*GPGYIQHFHCDSNILCILYNISFNLFSYSF*GVARYAC* 

RCPLVL* SGFFTIIVGGYSCCMPLKT 


5717 


44 


1489 


LPTEALRESEWVSEYGKCGPRGLVPEGESTSPLPSSVDTEDSLD 
EGPGALVLES DliLGQDLEFEEEEEEEEGDGNS DQLMG FERDSE 








GDSLGARPGIjPYGI^DDiSGGGI^I^AESEVEiPARGPGi^ARGE 
RPGPACQLCGGPTGEGPCCGAGGPGGGPLLPPRLLYSCR1.CTFV 
SHYSSHLKI^HMCm^GEKPFRCXSRCPyASAQLVNLTRHTRTHTG 
fc.KP YRCPHCP FACS Si/jNIiRRHQRTriAGP PTP PCPT CG FRCCTP 
RPARPPSPTEQEGAVPRRPEDAL.LIjPDLSLir/PPGGAS FLiPD CG 
Q\CGVKGRASAGLIX}NHCX?S/SLFPWTCRGCGQELEEGEGSRXiG 
AAMCGRCMRGEAGGGASGGPOGPSDKBFACSLCPFATHYPNHIA 
RHMKIHSGEKPFliCARCPYASAHIjDNLKRHQRVHTGBKPYKCPI» 
CPYACGNLANLKRHGRIHSGBKPFRCSiO^SC^OSMNLIRH^ 


5718 


120


284 


VAHAI ^ SI J PAESYGNDVS^mIPQLPPTQIAWDLalTCLPLSYNFT 
S* *STADPLHL> 


5719 


48 


428 


ELNNG P FQMP1»CNGGNjjAVTGS WADRS PLHKA ASQGR LLALRTI» 

LSQGYNVNAVTLDHVTPLHEACl^DHVACARTL^ 

IDGVTPI*FNACSQGSPSCAELLLEYGAKAQP\ESCLPSP 


5720 


1 


1051 


LQAFRNAS EVPMVLVGTQDAI S AA \N PRVYRRTSRARKLSTDLK 
\RCT\YYE\TCGGTYGLQMWSVSFQDVAQKWAl*\RKKQQ\liAI 
GPCX\SLPN\ SPSH\SAVSAAS I PARAPINQGHE/SGGGSAFSD 
Y\SSSVPSTPS ISQRELRIETIAASSTPTPIRXQSKRRSNI FTS 
RKGADP\DREKKAAGCKVDS IGSGRAI PI KQGI LLKRSGKSLNX 
EWKKKYVTIjCX>NGULTYKPSLHDYMQK1HGKEIDIJ^TW 
KRLPRATPATAPGTSPRANGLSVERSNTQLGGGTGAPHSASSAS 
LHSERPI*SSSAV7AGPRPEGLHQRSCSVSSADQWSEATTSLPPGM 
QHPASG 


5721 


97 


492 


RHSSPCCSLRRTERSSNAAVST/TTVQQFKRFIENYRRHIGCVA
VFYA I AGGL FL»ERAYYY AFAAHHTG I TDTTRVG 1 1 LSRGTAAS 1 
SFMFSY ri^TMCRNLITFLR£TFL.NRYVPFDAAVDFHRLIASTA 


5722 


88 


1043 


VALDVLAGSSPGGGMAGALLGPRVHGIRAVLRVARGGVQAPGAP
GSLGVSHAAAPPARPQGAAQSPHRGRR^GGGGAGLPPPRS PRFP 
QESVPASTSTARGPRRVSRRLPPQHPGPRGRRRRPGAGVGAPRR 
GRARGQAGLI^RQGQGGRGASRERAAIiQARRGRRPGPEPDQS CG 
GRPRRAAAAPG RAPADPQ P P APR P AP A?D VRP P ADAP AP AP A PA 

PP PP PHI*GAI*TAGS GEERQS QPRAETLRLGRGAPliP \ PRAERGG 
RPKQAEQQQ\PKRPTPPARGPQSSGDPAMLPQRAGLRTGGLAGT 

KSSTREIPEMI 


5723 


88 


1043 


VALDVLAGSSPGGGMAGALLGPRVHGIRAVLRVARGGVQAPGAP
GSLGVSHAAAPPARPQGAAQSPHRGRRHGGGGAGLPPPRSPRFP
QESVPASTSTARGPRRVSRRLPPQHPGPRGRRRRPGAGVGAPRR
GRARGQAGI^RQGQGGRGAERERAALQARRGRRPGPEPDQS CG 
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA 
P PPP PHLGALiTAGS GEERQSQPRAETLRIiGRGAPLP \ PRAERGG 
RPKQAEQQQ\ PKRPTP PARGPQS SGDPAMIiPOjRAGLRTGGLAGT 
KSSTREIPEMI 


5724 


3 


1641 1 


FTNEAPPAPLPDASASPLSPHRRAKSLDRRSTEPSVTPDLLNFK 
KGWLTKQYBDGXJWKXHWFAIiAD^ 

LSACYDVTE YPVQRNYGFQ IHTKEGEPTI»SAMTSGIRRNW IQTI 
MKHVHPTTAPDVTSSLPEEKEJKSSCSFETCPRPTEKQEAELGEP 
DPEQKRSRARJE \ RRREGRSKTFDWAE FRP I QQALAQERVGGVGP 
AiyraXDPWRPEAEHGELERERARRREERRKRFGMLDATDGPGTE 
DAMJWEVDRSPGI*PMSDI,KTHNVHVE IEQRWHQVETTPLREEK 

NRUjQDQLRVAiGREQSAREGYVLQATCERGFAAMEETHQKKI E 
DLQRQHQRELEKJLREEKDRLLAEETAATI SAI EAMKNAKREEME 
RELEKSQRSQI SSVNSDVEALRRQYLEEIjQSVQREI*EVLSEQYS 
QKCtiENAHIJ^AIiEAERQ ALRQ CQRHSQE LNAT^QBLNNRIiAAE 
ITRLRTLLTGDGGGEATGSPLAQGKDAYEIiEVPSGARPCLTQLC 
TQEPQGSAAWPLS Y R WGGTDLRQQESQG PGRS KSPEGGEEQ 


5725 


3 


1049 


VNGHSEETSQS PNRTEPHDSDCS VDLGI S KST3DLS PQKSGPVG 
SVVKSHSITNMEIGGLKIYDILSDN\DLSSHLQPLK/FTSAVCG
IXNIVRSKAATLLYDQPIiQVFTGSSSSSDIj ISGTKAI FKFDSNHN 
P E / GAK YN KR P HKWAHNIiHXiKYMVLHS 1 1 SNTVAV \ RS QRHFVA 
LOT KS PNRPCOFS SSAPS / VDORAO / INQS YAKHSANMNFSNHN 
NVRANTAYHLHQRIX3PARHGEMWAI S PNDRLI PAVTRSTIQRQS 
SVSSTASVNI^DPGSTRRAQIPEGDYIjSYREFHSAGRTPPMMPG 
SQRPLSARTYSIDGPNASRPQSARPSINEIPERTMSVSDFNYSR
TSP 


5726 


2 


486 


; SRSUS^WWNSGLPASSHSSKJiPVTVGFSGCVrKPO^RI^GRPLGAP 
TRMAGVTPCILGPLEAGLFFPGSGGVITX/ESVGAGI PGPSRAG 

• AfCDfVOrPrtDDI CODCrtQT nnnT O^&TT O P»T7^2 T ITT CTTODT at 7T* 
' UUo ruliou&ur rlob r r ul'AUlii'unl brU Voliciij&VKr JjAV X 

GLI FHl^QARTPPYLQLQVTEKQVltLRADDG 


5727 


21 


221 


RPILILKETRRLPWATGYAEVINAGK^THNEEXJASCEVIjT^/KKK 
AGAVTSTPNRNSSKRRSSLPNGE 


5728 


2 


877 


GTRNGQFEPRRGRAWEGSAGGLRAPGAAAGGPGVQPRGSG/LPG 
NAIRAGVNPGRGPASPFWDLSLPRDLWPPPTDHAPGAPDFPAVE 
GR\ PWAGGRPPWpVSGVLGSRVCGPLYSTSPAGPG/SGGLS PSQ 
GGPAGAGGDAG/LPGRCPSAPWRAGSRPAASCPDWIPGPQGLWL 
HRNPTS/GPPSQIGEGAEQGDBGVADAPQIQCKN/GAEDPPAED 
EPPQVPEAGEEDAVPAEEGPGGTPETQADQVRERPEAHLAEGGA 
KG S PRRXiADPQDLPAGQMS LAPP FP PVAAVI RSNK 


5729 


1 



1525 


AGGAREVLTLQLGHFAGFVGAHWWNQQDAALGRATDSKEPPGEL

CPDVLYRTGRTI^GQETYTPRLII>MDLKGSLSSLKEEGGLYRDK 

QIJ^AAIAWQGKLTTHKEEI^YPKWPYLQDFLSAEGVLSSDGVWRV 

KSIPKGKGSSPLPTATTPKPLIPTEASIRVWSDFLRVHLHPRSI 

CmQKYNHDGEAGRLEAFGO^ESVLKEPKYQEEL 

CDYLQGFQILCDLHDGFSGVGAKAAELLiQDE YSGRG I ITWGLLP 

GPYHRGEAQRNIYRIil^TAFGLVHLTAHSSLVCPLSXJGGSLGLR 

PEPPVSFPYIjHTOATXPFHCSAXIiATALDTVTCSNYRLCSSPVS 

mvhl \admls fcg kkvvt ag a 1 1 p f plafgqs lpds lmqfc»{*a1 

PWTPLSACGEPSGTRCFAQSWLRGIDRACHTSQLTPGTPPPSA 

LHACTTGEEILAQYLQQQQPGVMSSSHLLLTPCRVAPPYPHLFS

SCSPPGMVLDGSPKGAAVESVPVFG 


5730 


1258 


1713 


KKFQAPARETCVECQKTVYPMERLLANQQVFHISCFRCSYCNNK
LSLGT YASLHGR1 YCKPHFNQLFKS KGNYDEGFGHRPHKDLWAT 
KIETEGFWERPRNFENCGRPLKSPGGEDCPSC*GGCPGSNY*AQ
GSSSREKGGQASWNPKLRVA 


5731 


122 


443 


RSHRGEL1 P KDSCYMRKPP RRP KKRRQG / CAL PQG CLTFKDVA I 
EFSLEEWKCi^FAQRALYKAVMljJbcJ iKWbciovoiji ^rUJoVS i nxciv. 
KPGRGRGKQRRQEWFFURVY 


5732 


226 


772 


PPSRSCQSPRRKSRRRAHVTVTLVCGFTSFSFSLPLYIjCGCIiRF 

PERTCS QLQQADWAPDFGPS S FVPS WGATATGARKFLIAFNI \N 

IJjGTKEQAHRIALNLREQGRGKIX^PGRLXKVO^ 

Q VSTNLLDFEVTALHTVYE ETCREAQELS LP WGSQLVGLVPLK 

ALLDAA 


5733 


1 


460 


PAIjQE VNA^iALAWGKQ YENDARTLF E FTSG VNDT ES P 1 1 YRDES 

MDtpS PCDTVT J^QTif'KtnT .T7T.Vr , DT? r PQ15mi , MJfTrPT^If5'P r PEl.TTf ^ JiVM 
Nr. i. rv_ it* r 1 11 ^ 1 " .^iHiMiii [P r f ii\\.-y p n lonur I'ICvC XvuUrV7f DivlIwAxn 

AQVQYSMWVTRKNANYFANYDPRMKREGLHYVVI ERDEKYM\AS 
FDEI\VP\EFIGKMDEVLSRDPM 


5734 


3 


968 


RCNS PES LTSLLVLLTTANNLFVL.I PAYS KNRAYAI F?I VFTVI 

GSLFLMNLLTAIIYSQFRGYLMKSLQTSLFRRRLGTRAAFEYLS

S^WGEGGAFPQAvGVKPQNLIJQVI^KVQLUS&l^Ay 

fiSVtliSAEEroKLFtreLDRSVVXEHPPRPEYOSPFLQSAQFLFG 

HYYFDYIjGNLIALANLVSICVFLVLIIADVLP 

VFIVYYTJiRMIiIJWAIK3LRGYLSYPSNVFIX3IjI*TV^ 

TL\VCTI)CHTQAGGRRWW/RI^SI^DMTRMI^MLIVFRFDRIIP 

SMKPMAWASTVLGL 


5735 


2 


540 


FFTPCVARAFNFPDQATVKKAAYSLPRVGGGTSCGLPQARRISL
ATPRQLYK/SSNMTQRWQRREISNFEYI^MFLNTIAGRTYNDLNQ 
YPVFPWVLTNYESEELDI*TLPGNFRDLS KPIGALNPKRAVFYAE 
RYETWEDDQSPPYHYNTHYSTATSTLSWLVRIVSIFIELACLWY
* LXZLT 


5736 


1 


382 


GTRPSTKKSGYSPQQVAVIHCXGHQKENTAVAHSNQKADSAAQV






TARI^SVTPPNLLPTVS FPQPDLPDNPV YSTTTE KJLAS DLRANKN 


5737 


290 


1041 


KACLHLLS S FLTSNFLFNPLIi PDSLYS VEARSQRANltG PCRRKR 
LQTLMRLAAG FQYSSH KDPS IiSAKEKKTD YHNEARG P WPGWVG * 
RTAIXSSCGRGPDGAHHPGPKSSSWRASRIiliPGLGGSHHLDAYVG 
RDLECGTPAPLQLE I P PQ PRGHPAP I PTGOAG PRDS G PG AS P* V 
BTRPLTDGRR* PGVRPVG WTPAHPAGTLRPRGAVE PSVS ACGKW 
APSPTSQGCCEGRCDAVPKHRAWRTPLCSQ 


5738 


8 


460 


DTLSLNCTLPETLPMTPS F* LSFL* FPGIxARAKSIPTKTYSNEV 
VTLWYRPPDIIjLGSTDYSTOID[W*GQVEVWCK3PCGKGGGI>VTT 
ATQPAAFIiFTVPSDPRGVGCI FYEMATGRPLFPGSTVEEQIiHFI 
FRI-^EEAWALiCAVETHR 


5739 


1 


1222 


S FQRRG I RWNVHTliH PH P RAV WAG IGRGHGS * ALLG RARAPALC 
FPTLLEFLESI^PDI,PAI*RAMGLHI*WAAGPGTHPAGI SDIiLAEV 
SAFA/IXjPVPGYLSSPQSITlXrCliYIFTSGTTGLPKAARISKLKI 
LQCCGFYQLVCGVHQEDVI YLALPLYHMSGS LLGIVGCMG IGATV 
VLKSKFSAGQFWEDCQQHRVTVFQYIGELCRYLVNQPPSKAERG 
HKVRIAVGSGLRPDTWERFVRRFGPLQVLETYGLTEGNVATINY
TGQRGAVGRASVrt>YlCHIFPFSLIRYDVTTGEPIRDPQGHCMATS 
PGEPGIjLVAPVSQQSPFI*GYAGGPELAQGKIjIJCDVFRPGDVFFN 
TRDLLVCDDQGFLRFHDRTGDP FRWKGENVATTEVAEV FEAU3F 
LQEVNVYGVTV 


5740 


265 


231 


PAYWLKVPTLCiESKTDl.REKASHVSAQM^EVRGIiAGAliWM*A 
YVYERVYN*NISRMVHALEQKRHPAGDSSSMALQLNPCLGMIiMA 
LQSELHKLYBEETQSWVSGSACGGYP 


5741 


1 


650 


PRKTMRRGVLMTLIiQQSAMTIjPLW IGKPGDRPP PliCGAI PASGD 
YVAR PGDKVAARVKAVDGDEQW I JUAE \Ar £> Y£>HATWK.YJ£VIjDXDE 
EGKERHTL»SRRRVIPIiPQWKANPETDPEALFQKEQLVLALYPQT 
TCFYRALIHAPPQRPQDDYSVI.FEDTS YADGYS PPLNVAQRYW 
ACKEPKKK*CRltADSPSPNDTGQDSRGRAGIKHIPPLKKK 


5742 


2 


362 


TQSVKEILKRNPNVNIjTDKDGNTAJLMIASKEGHTEIVQDLjLDAG 

TYVNI PDRSGDTVEjI gavrgghve i vrallqkyadi dirgodnk 

TALYv^VEKGNATMVRL)l JjQCNFD I r. iLi KIAj 


5743


2 


415 


GKTPEG I DA I EE I E I D1»EETEREI S PQENGLE E VKPLGEMQTDL 
KATGRE I SPREKTPEVIDATEEIDKDLEETGRREI s peengpeb 
VKPVDEMETDLKTTGPJEGSSRFKTREVIDAAEVI ETDLEETERE 
ISPQE 


5744 


3 


703 


TRRTTTTSPTTTRQMTT^PAALPTTWTTPDLTTGTPI>2MTTIA 
VFTTANTCLSLTPSTIjPEKATGI»IiTPEPSKEGPILTAESETVI.P 

PQPGASDTAVPEQNKTTKTGQMDGI PMSMKNEMP ISQLLMI JAP 
S LGFVLFALFVAFLLRGKLMET YCSQKHTRL»DY I GDS KNVLNBV 
QHGREDEDGLFTL 


5745 


1400 


599 


GKSR FVNJbMKHS KKT YDS FQDELEDY I KVQKARGIiEPKTCFRKM 
XGDYLETCGYKGEVNSRPTYRMFDQRLPSETIQTYPRSCNI POT 
VENRLPQWIiPAHDSRIjRIiDSLSYCQFTPJJCFSEKPVPLNFNQQE 

EKSEEERS KHKRKKS CE E I DI»DKHKS I QRKKTEVE I ETVHVSTE 
KLKNRKEKKSRD WS KKEE31KRTKKKKEQGQERTEEEMLWTJQS I 
LGF 


5746 


3 


821 


SFASGRLTPSSPAFDGELDIiQRYSNGPAVSAWSIiGMGAVSWSES 1 
RAGERRFPCPVCGKRFRFNS ILALHLRTHQPERPRSPAARIjJLLE 
LEEPJUjLRFJVRLGRARSSGGMQATPATEGLARPQAPSSSAFRCP 
YOCGKFRTSAEPvERHLHIIiHRPWKCGLCS FGSSQEEELLHHSLT 
AHGAPERPIiAATSAAPP PQPQPQP PPQPE PRS VPQPEPE PQPER 
EATPTPAPAAPEEP PAP P E FRCQVCX3QS FTQS WFL.KGHMRKHKA 
SFDHACPV 


5747 


2 


1328 


DRHVETLCIHFi^PSTGSTAKTGGRlWIJCTGNCXiYGNTCRFVHG 
PS PRGKG YSSNYRRS PER PTGDIiRERI KKKRQDVTDTEPQKRNTE 
ESSSPVRKBSSRGRHREKEDIKZTKJEKTPESEEENVEWETNRDD 
SDNGDINYDYVHELSLEMKRQKIQRELMKLEQENMEKREEIIIK

AVSSPLLDQQRNSKTNQSKKKGPRTPSPPPPIPEDIALGKKYKE 
KYKVKDRIEEKTRDGKDRGRDFERQREKRDKPRSTSPAGQHHSP
ISS RHHS5 S S QSGSS IQRHS PS PRRKRTPS PSYQRTI/TPPLRRS 
AS P Y PSHSI*SS PQRKQS PPRHRS PMREKGRHDHERTSQSHDRRH 

KKKrJJl ci\j iVKJJKn. KL/bKCi£>K& I £iUUUooo KUtlK±JUJXEi rKIAaKUK 
RE 


5748 


934 


473 


SEGPQVFYKGLAPTLIAIFPYAGLQFSCYSSLKHLYKWAIPAEG
KKNENI^NLLCGSGAGVISKTLTYPLDLFKKRLQVGGFEHARAA 
FGQVRR YKGIiMDCAKQVLQKEGALGFF KGIjS PSUjKAALSTGFM 
FPSYEFFCNVFHCMNRTASQR 


5749 


552 


1 


GPPVDPRVRGSTbSLAERPKGMIRSGSFRDPTDDVHGSVLSLAS 
S AS STYSSAEERMQSEQI RKLRRELES 5QE KVATI*TSQI,S ANAN 
LVAAFEQSLVNMTSRLRHLAETAEEKDTELLDI*RETIDF1.KKKN 
SEAQAVIQGALNASETTPKELRIKRQNSSDS ISSUSJS ITSHSS I 
GSSKDADA 


5750 


22 


866 


IFISICLWNAHLCFLIil.PKDCIDOVMKLQNLFVDDSGRYLAIQF 

IILEWAYVFLYYYEYRKAKDQIiDIAKBISQIjQIDLTGAIjGKRTRF 

QENYVAQLIIjDVRREGDVI>SN(^FTPAPTPQBHIj^ 

ILNDIKXADCEQFQMPDLCAEEIAIIIXJICTNFQKNNPVHT^ 

VELLAFTSCI^SQPKFV?AIQTSALILRTKLEKGSTRRVPJLAMRQ 

TQALADQFEDKTTSVLERLKI FYCCQVP PHWAI QRQI*ASLI*FEX 

GCTSSALQIFEKUEMWE 


5751 


3 


751 


SCGS ALRAWRCGAA z UjAT FP APALPGLMYRALYAFRS AE PNALA 
FAAGETFLtVLERS S AHW WLAARARSGETG YVP PAYXiRRIjQGIjEQ 
DVI^AIDRAIEAVHNTAMFJDGGKYSLEQRGVLQKIilHHRKETI^S 
RRG PS AS S VAVMT S S TSDHHLDAAAARQ P NGV CRAG r ERQHSij P 
SSEHLGADGGIiFQI PLPSSQI PPQPRRAAPTTPPPPVKRRDREA 
LMASGSGGHNTMPSGGNSVSSGSS VSS CI 


5752 


3 



471 


GPVCGV GLSVAWAG PWRG P VHS V WKSGRA/UjtKjAJ&ij f Uiji>t» M x 
VEREMEIiRHKNEMLRVETKARARAKAERENADI IREQIRLKASE 
HRC/TVLES IRTAGTLFGEGFRAFVTDRDKVTATVN I FIKQGWQV 

AERQHVGASWS PRSCPCKLCTAL 


5753 


34 


483 


DDSXAIPGGVQAPFGAVRNIYTPRTGHRIRKLDQIQSGGNYVAG
GQEAFKKIjNYLDIGEIKKRPMEVVin'EVKPVlIHSRINVSARFRK 
PLQEPCTI FTjIANGDLINPASRLLIPRKTI^QWDHVTX}MVTEKI 

1 IiK.SNAVHKIjY 1 iit^«J_»V 


5754 


14 


331 


TLVHWEFAGEHAEAIASREQEVIjQGWKELI^ACEDARLHVSST 

adalrfhsqvrdllswmixsiasqigaabkprcpssli^lpaspw 
wptpatps pl.tapfsme 


5755 


3 


888 


LGDQFYKEAIEHCRSYNSRLCAERSVRLPFLDSQTGVAQNNCYI

PEVELPLKKIXSFTSESTTIiEALI^GEGVEKKVDAREEESIQEIQ 
RVLENDENVEEGNEEEDLEEDIPKRKNRTRGRARGSAGGRRRHD
AASQEDHDKPYVCTICX3KRYKNRPGLSYHYAHTHIASEEGDEAQ 

nrtPTD O D DKTHD 1MT7NTTP PnTT^PT^fl'FVTPNNYi^DFCIiGGSNMNKKS 

GRPEELVS CADCGRSAHLGGEGRKEKEAAA 


5756 


3 


621 


SSKliQAIiFAHPLY^PEEPPLIXJAEDSLIASQEALRYYRRKVAR 
WNPJ^HKMYREOMNLTSI^DPPLQLRLEASWVQFHIjG INRHGLYS r 
SSPWSKI^DMRHFPTISADYSQDEKAIAGACDCTQIVKPSGV 
HLKLVLRFSDFX3KAMFKPMRQQRDEETPVDFFYFIDFQRHNAEI 
AAFHLDRI IjDFRRVPPTVGRI VNVTKEIL 


5757 


3 


473 


YKDAIJjLPDNHRQVVFENGTUCLTDVQKGMDEGEYLCSVLIQPQ 
LSISQSVHVAVKVPPLIQPFEFPPASIGQLLYIPCWSSGDMPI 
RITWRKDGQVI ISGSGVTIESKEFMSSI»QISSVSI.KHNGNYTCI 

asnaaatvsrerqlivrvpprfvv I 


5758 


1 


474 


PRRGAGAERGEHREGERGAAGMGEFKVHRVRFFNYVPSGIRCVA 
YNNQSNRIAVSRTDGTVEIYNI^ANYFQEKFFPGHESRATSAI.C 
WAEGQRLF S AG LNGE I MEYDLQAU? I KYAMDAFGGP I WS MAAS P 
oV^oUl^VL»l-i^JA«VKJjrQITFlJlVJ. fV ( 


5759 


2 


1240 


GNAAFAGQGVVYETFHMSDI.PSYTTNGTVHVVVNNQIGFTTDPR 
MARSSPYPTDVARVVNAPIFHVNADDPEAVIYVCSVAAEWRNTF
NKDVGADLVCYRRRGHNEMDEPMPTQPLMYKQIHRQVPVLKKYA
DKL I AEGTVTLQEFE EE I AKYDRI CEEAYGRS KDKK ILHI KHVTL 
DSPWPGFFNVDGEPKSMTCPATGIPEDMLTHIGSVASSVPLEDF
KIHTGLSRI I^GRADOTKNRTVDWAIAEYMAFGSLLKEG IHVRL 
NGQDVERGTFSHR>3VLHDQEVDRRTC^PMNHLWPDQAPYTVCN 
S S I*S EYGVLG FELG YAMAS PNAL VLWEAQ FGD FHNTAQ C 1 1 DQF 
I STGQAKW VRHNG IVLLLPHGMEGMGPEHS S ARPERFLQMSNDD 
S DAYPAFT KDFEVSQIj 


5760


1 


1221 


VRD I TSDSI*S LS WTVPEGQ FDKFLVQFKNGDGQP KAVRVPGHED 
GVTI SGLEPDHKYKMNIiYGFHGGQRVGPVSAVGIjTAPGKDEEMA 
PASTEPPTPEP? I KPRIJEELrVTDATPDSI*S LSWTVPEGQFDHF 

LVQYKNGDGQPKATRVPGHEDRVTISGLEPDNKYKMNLYGFHGG

CRVGPVSAIGVTAAEEETPTPTEPSMEAPEPPEEPLLGELTVTG 
SS PDSIiSLSWTVPQGRFDS FTVQYKDRDGRPQWRVGGEESEVT 
VGGLEPGRKYKMHLYGLHEGRRVGP VS TVGVTAPQEDVDETPSP 
TEPGTEAPEPPEEPIjIjGELTVTGSSPDSLSLSWTVPOGRFDSFT 
VQYKDRDGRPQAVRVGGQESKVTVRGIiBPGRKYKMHIiYGLHEGR 
RLGPVSAIGVT 


5761 


3 


1275 



SCDMAEAAALVWIRGPGFGCKAVRCASGRCTVRDFIHRHCQDQN 
VPVENFFVKCNGALINTSDTVQHGAVYSLEPRLCGGKGGFGSML 
RALGLAQ IEKTTNREACRDLSGRRLRDVNHEKAMAE WVKCX3AERE 
AEKEQKRI^RI^RKLVEPKHCFO'SPDYC^QC^IEMAERl^SVIjK 
GMQAAS S KM VS AE I SENRKRQWPTKSQTDRGASAGKRRCFWIX3M 
EGLBTAEGSNSESSDDDSEEAPSTSGMGFHAPKIGSNGVEMAAK 
FPSGS QRAR WNTDHGS PEQIjQI P VTDSGRH ILEDS CAELGES K 
EHMESRIWTETEETQEKKAESKEPIEEEPTGAGliNKDKETEERT 
DGERVAEVAPEERENVAVAKXQESQPGNAVIDKETIDLLAFTSV 
ART.KT.T^tiEKT.KCE ■ MAT »GT»KCGGTIiQ 


5762 


2 



344 


GSTGQTPLHSOGGGGGSGGGRRRTPRGMPKEKYEPPDPRRMYTI 
MSSEEAANGKKSHWAELEISGKVRSLSASLWSLTHLTALHLSDN
SLSRIPSDIAKLHNIiVYDDLSSNKIR 


5763 


3 


129 


LDKDTGli I MIjIARIjD YEIiI QR FTIjT I IARDGGGEETTGR VR I N V 
LDVNDNVPTFQKDAYVGALRENEPSVTQLVRLRATDEDSPPNNQ 
ITYSIVSASAFGSYFDISLYEGYGVISVSRPLDYEQISNGI-IYL 
TVMAMDAGN 


5764 


19 



441 


VCARACGEMRQLJjRP IDRQR YDENEDLSD V fcifc.1 VS VRG FST iEEK 
LRSQLYCX5DFVHAMEGKDFNYEYVQREALRVPLI FREKDGLGIK 
MPDPDFTVRDVKIJLVGSRRIjVDVMDVNTQKGTSMSMSQFVRYYE 
TPEAQRDKL 


5765 


3


825 


QKILRLNNSHQPPTSSSNSKDCGGPASSGAGATAAIoADGLKFAS 

vqasapqgnshkets ks kvkrs kts koanks lpsaalyg i pe i s 
stgkrqevqgrpgeatgmnsalgqsvssggsgnpnsnstststs 
aatagagscgkskeekpgksqssrgakrdkdagksrkdkhdiilq 

CjUttNC^Q»oQAP&VaGHI*YGrGLftJCSN nCGGTG£>G5 V AAA 
GEVS KSAPDSGLMGNSMIiVKKEEEEEESHRR X KKLKTEKVDPL F 
TVPAPPPHV 


5766 


1608 


663 


SGT iP^VDPZlS ?OAMRT »<?71 VTT . T KRVRMH/MWACWVL T » 
wwu^ *j v ur/wwyu imii^L/ v x ju jl x^j vj v v i i v v avvj v v v jja unuvu 

AMLSTYVADSGSNQLLGAIVSAGDTSVI.HLGHVDHLVAGQGNPE 
PTELPHPSEGNDEKAEEAGEGRGDSTGSAGAGGGVEPSLKHT.LD 
IQGLPKRGJVGAGSSSPEAPLRSEDSTCLPPSPGLITVRLKFLND 
TE ELAVARPEDTVG AliKS KYFPG QESQMKLI YQGRLLQDPARTIj 
RSLNITDNCVIHCHRSPPGSAVPGPSASLAPSATEPPSLGVNVG
SIMVPVFVAnjI^VVWYFRINYRQFFTAPAWSLVGVTVFFSFLV 

FGMYGR 


5767


2 


892 


NFRATPRPPTRPEIiRTGTE VILWYLDVJRAIWKRKRMKAN I KLVG 
SGFPLPSSDLDDSLTEEIDEKIGFRWDANFDWQNVADFRDAGGS
LTEVKVEEEERDPQSPE FE I EEEEEMLS S VI PDSRREN EIiPDFP 
HIDEFTTIiNSTPSRSAYDEPHIiLVNIEKQyT iKT .KKRRU)IEAER 

I^VEKEM^IEKHUiRHIJ3MEHERU2LEKERM 
NSEKPSIiBNELGQGEKSMLQPQDI ETEKI»KLERERLQLEKDRI*Q 
FLKFESEKXiQ I EKER LQVEKDRIiR I QKEGKLQ 


5768 


3 


476 


SSRSRLSVSVSPPPPGIVELGPPFAWEFCSRLGSAVTSQRAGPA 
AAMVAKDYPFYljTVKRANCSIiEIiPPASGPAKDAEEPSNKRVKPIj 
S RVTS LANIi I P PVKATPLtKRFSQTLQRS I SFRSESRPDILAPRP 
WSRNAAPSS TKRRDSKI*WSETFDVC 


5769 


38 


667 


TK^KKGVKEKATDQSVKAFAEHCPELQYVGFMGCSVTS KGVIHI* j 
TKLRNIJSSIlBl J RHITEl l DNETA^IE I VKRCKNLI SUULCLNWI IN j 
DRCVEVIAiCBGQ^^JKEI J yIJVSCKITDYAI»IAIGRYSOTIETVDV 
GWCKEITDQGATIiXAQSSKSIJ^YLGIjMRCDKVNEVTVEQLVCX^Y 
PHITFSTVLQDCKl^rJjERAYQMGWTPNMSAASS 


5770 


1 


484 


DSRRYDVKTRKWSFIJjEEHSKLIAKVRCLPQVQLDPLPTTLTLA 
FAS QLKKTSIiS LTPDVPEADLS EVDPKLVSNLMP FQRAGVNF7AI 
AKGGRLIJLADDMGLGKTIQAXCIAAFYRKEWPLIiVVVPSSVRFT 
WEQAFLRWLPSLSPDCIITVWTGKDRIjTA 


5771 


168 


741 


GLLPSACIiRARSWREASEGPSSRACSNGSQDTFEACYSGTSTPS 
FHGSHCSGS DHS S LG LiEQ LQD YM VTI»R S KLG P L.E IQQPAMLLRE 
YRLGLPIQDYCTCLLKLYGDRRKFLLLGMRPFIPDQDIGYFEGF
LEGVG I REGG I I*TDS FGR I KRSMS S TS ASAVRS YDGAAQRPEAQ 
AFHRLIiADITHD I E 


5772 


148 


383 


EF!OjALVSPSHPQIKAE1)DQP1jPGV1jLSLSGGLFRSNLI.TQDNG 
ILTFSNIiVTCSAIYHLPVFPEREPGCSMRDLRVA 


5773 


2 


723 


PRVRS KHNFC FMEMNTRIjQVEHP VTEM I TGTDIjVEWQLR IAAGE 
KIPI^QEEITI>QGHAFEARIYAEDPSNNFMPVAGPLVItLSTPRA 
DPS TR I ETGVRQGDEVSVHYDPMI AKLVVWAADRQAALTKLRYS 
LRQYN J VGLHTWIDFIiIjNLSGHPE FEAGNVHTDFI PQHHKQLLI* 
SRKAAAKESLCQAAI/SLILKEKAMTDTFTTLQAHDQ FSPFSS S SG 
RRLNISYTRNMTLKDGKNSK 


5774 

1 


2 


592 


FVEEEN IRWRCGGS EIjN FRRAVFS ADS KYI FCVSGDFVKVYST 
VTEEC^milaHGHRNLVTGIQLNPNNHLQLYSCSLDGTIKLWDYI 
DGILI KTFI VGCKLHAIjFTIjAQAEDS VFVTVNKE KPD I FQLVS V 
KLPKSSSQEVEAKELSFVLDYINQSPKCIAFGNEGVYVAAVREF
YLSVYFFKKETTSRVTLSSS


5775


3 


538 


SSGCCDPAAPSSIjAEAATMP VS KC P KKSBSLWKGW DR KAQRNG L 
RSQVYAVNGDYYVGEWKDNVKHGKGTQVWKKKGAIYEGDWKFGK 
RDGYGTLSLPDQQTGKCRRVYSGWWKGDKKSGYGIQFFGPKEYY 
EGEJWCGSQRSGWGRMYYSNGDIYEGQWENDKPNGEGMLRLSQNP 

RP 


5776 


2 


484 


RLPQDCVC^NI^ESL#3TI>CPSKGLL 

I IHISRQDFANOTGLVDLTI^RNTISHIQPFSFIJ>LESIjRSIiHIj 
DSNRI*PSI»GEDTLRGIjVNLQHIi I vNKNQLGG I ADEAFEDFIjIiTIj 
EDLDLSYNNLHGPAVGIiRGDAWVQPSTS 


5777 


2 


949 


GODPEPGQDIjFQPEREWPSWGRGREPRIiGKXiRFQNDHIUSVLKQ 

vkkleqalkdgs agldpqlpgtcy sphcppdkaeagstlpeni»g 
ggsgsevsqrvhpsdlegreptpelvedrkgscrrpwdrslenv 

yrgsegsptkpfinpi>pkprrtfkhagegdkdgkpg igfrkekr 
nlpplpslpppplpsspppssvnrrlwtgrqkssadhrksyefe 

DLLO^SSESSRVDVrrAQTKIiGLTRTIaSEENVYTOILDPPMK^^ 
vpnTi?T urmf-T rirvnn KTPP a <I PTQ^T POTT /TKOSL.S KP AFFRO 

NSSRRNV 


5778 


1 


1210 


QRRQSVS RLliLP VFI*I*E P PAE PGLEPP PEEEGGEP AGVAE EPGS 
GGPCWLQLEEVPGPGPLGGGGPLRSPSSYSSDELSPGEPLTSPP
WAPLGAPERPEHLI^RVLERLAGGATRDSAASDILIiDDIVLTKS 
IiFLPTEKFIjQEIiHQY FVRAGGMEGPEG1K5R^QACXJ\M1i1iHFIiDT 
YQGLLQEEEGAGH 1 1 KDIjYLI»IMKDESI,YQGLREDTIiRIJHQI»VE 
TVEIjKIPEEWQPPSKQVKPLFRHFRRIDSCLQTRVAFRGSDEIF 
CRVYMPDHSYVTIRSRLSASVQDILGSVTEKLQYSEEPAGREDS
LILVAVS S SGEKVIXQPTEDCVFTAIiGINSHIiFACTRDS YEALV 
PLP EE IQVS PGDTE IHRVEPEDVANHLTAFHWELFRCVH3LE FV 
DYVFHGE 


5779 


138 


1571 


EAVQVLI KHSAD VIiUU^DKlWQTPliHVAAAN KAVKCAEVI I PLLS 
SVNVSDRGGRTAbHHAAIjNGHVEimiT.tl^GANINAFDKICDRR i 
ALHWAAYMGHLD WALL INKGAEVTCKDKKGYTPLHAAAS NGQ I 
NWKHLLNLGVE I DE I NVYGNTALH I ACYNGQDAWNEL I DYGA 
NVNQPNNNGFTPLHFAAASTHGALCLELLV1WGADVNIQSKDGK 
SPLHMTAVHGRFTRSQTLIQNGGEIDCVDKDGNTPLHVAARYGH
RT.LlNTLITSGADTAKCCJHSMFPIiHLAALNAHSDCCRKLLSSG 
QKYS I VSLFSNEHVLSAGFEIDTPDKFGRTCLHAAAAGGNVEC I 
KLLQS SGADFHKKDKCGRTPLHYAAANCHFHC 1 ETLVTTGANVN 
ETDDWGRTALHYAAASDMDRNKTILGNAHDNSEELERARELKEK
EATLCL3FLLQNDANPS IRDKEG YNS IHYAAAYGHRQCLELLLE 
RTNSGFSESDSGATKS PLHLAVSEMP 


5780


154 


624 


QPFRVITCLPFKGPDYRLYKSEPELTTVAEVDESNGEEKSEPVS
EIETSWKGSHFPVGVVPPRAKSPTPESSTIASYVTLRKTKKMM
DLRTERPRSAVEQLCLAESTRPRMTVEEQMERIRRHQQACLREK
KKCj LNV I GAS DQ S P LQS P SNLRDNP 


5781


19 


941 


RGSLGGHPWRPPMRAASQGCLPVSFVTGPHQERAYGGRGPGGAF
PAPPVSGTCPPDLI YAPTPEKAEGGSQKNHQPPPGERAAHRDGE 
QAPCRAGPTRKVAVAPRPPSCP*GPE\PGEEPRRPLDRSPPLGQ 
VaPHFTSQDAKSAEDEAPSRHLGKHQPRSAQVGSRLDALQGPKT 
QHS IHTVTCKS PRQKEDRSPKPPQAP KHPEBHGRQS \0APPP LP 
VAPSRTCGGC* TWDPALLVS P / PQGDST PELPAP \ QQPTGG PS R 
CRQALPPOG*RQQPRQRPR/PTGASRSHPAKAKGCQGPPKIRNY 
NIMD 


5782 


5176 



1237 


DRSMM5 MAADS YTDS YTDT YTEAYMV P PLPPEEP PTMPPLP PE E 
PPMTPPLPPEEPPEGPALPTEQSALTAENTWPTEVPSLPSEESV 
SQPEPPVSQSEISEPSAVPTDYSVSASDPSVLVSEAAVTVPEPP 
PEPESS I TLTPVESAWAEEHEVVPERPVTCMVSETPAMSAEPT 
VLASEP PVMSBTAET FDSMRASGHVASEVSTSLLVPAVTT PVLA 
ESILEPPAMAAPESSAMAVLESSAVTVLESSTVTVLESSTVTVL 
EPSWTVPEPPWAE PDYVTI PVPWSALEPSVPVLEPAVSVLQ 

psmivsepsvsvqestvtvsepavtvseqtqviptevaiestpm 
i less i msshvmkgi nlssgdqnlape i gmqe ialhsgee phae 
ehlkgd fye s ehg ik i dlni nnhliakemehntvcaagts pvge 
igeeki lptsetkqrtvldtypgvseadagetls stgpfale pd 
atg\tskgi efttas tlslvnkyd vdls lttqdtehdmli s ts p 
sggseadiegplpazdihldlpsninlvssdtneplpvkrd\dq 
tlaali \ slkessggekevppps * rehlpdsgfsaniedinead 
lvrpvss prtwnvlps pragl\egp\ llasdfgpvqnlysspw 
\ssmp\erasgs\ssgekgg\yeifvkvkdthetcskknknrdkg 
e kekkrds s lrsrskrs kss ehks rkltses rsrarkrs sks ics 
hrs\qtrsrsrs/rdrrrrssrsrsksrgrrsvskekrkrspkh 
rsksrerkr}o^ssrbnrktvrarsrtpsrrsrshtpsrrrrsr 
svgrrrs fs ispsrrsrtpsrrsrtpsrrsrtpsrrsrtpsrrs 
rtpsrrsrtpsrrrrsrswrrrs fs is pvrlrrsrtplrrrfs 
rspirrkrsrssergrspkrltdldkaqlleiakanaaamcaka 
gvplppnlkpapppt ieekvakksggati eeltekckqi aqske 

DDDVIVNKPHVSDEEEEEPPFYHHPFFCLSEPKPIFFNLNIAAAK 
PTPPKSQVTLTKEFPVSSGSQHRKKEADSVYGEWVPVEXNGEEN 
KDDDNVFSSNLPSEPVDISTAMSERALAQKRLSENAFDLEAMSM
LNRAQERIDAWAQLKS IPGQFTGSTGVQVLTQEQLANTGAQAW 1 
KKIXJFIjRAAPVTGGMGAVI^KNGWREGEGIXSKNKEGNKEPILV 
DFKTBRKGLVAVGERAQKRSGNFSAAMKDLSGKHPVSALMEICW 
KRRWQPPEFLLVHDSGPDHRKHFLFRVLINGSAYQPNCMFFLNR
Y 


5783 


1693 


698 


JDSGLRVAFTMEGISNFKTPSKLSEKJCKSVLCSTPTINIPASPFM 
QKLGFGTGVNVYLMKRSPRGLSHSPWAVKKINPICNDHYRSVYQ
KJlLMDEAKILKSLHHPNIVGYRAirn'Jfc^A^ 

NDLIEE/ PI * SQ/PKILFQQP/L ILKVALNMARGLKYLHQEKKL 
LHGDI KSSNWIKGDFETI KI CDVGVSLPIiDENMTVTDPEACY I 
GTEPWKPKBAVEENGVITDKADIFAFGLTLWEWMTLSIPHINIiS 
NDDDDEDKTFDESDFDDEAYYAALGTRPPINMEELDESYQKVIE
LPSVCTNEDPKDRPSAAHTVRALSTDV


5784


2669 


1388 


PRVRPRVRTDHNYYI SRI YGPSDSASRDLWVNIDQMB KDKVKIH 
GILSNTHRQAARVrnjSFDFPFYGHFLREITVATGGFXYTGEVVH 
RM I/TATQ Y I AP LMAN FD PS VS RN S TVR Y FDNGTALWQW DHVHL 
QDNYNLGS FTFQATLLMDGRI I FGYKB I PVLVTQ I SSTNHPVKV 
GLSDAF VWHRIQQI PNVRRRTI YEYHRVELQMSKI TNI SAVEM 
TPLPTCLC; FNRCG PCVS SQ IG FN CSWCS KLQRCS SGFDRHRQDW 
VDS GCFEES KEKMCENTEPVET \ FLEP PQP * SRQPPSSGS * LPP 
E/DAVTSQFPTSLPTEDDTKIAXJiLKDNGASTDDSAAEKKGGTL 
HAGLIVGILILVLIVATAILVTVYMYHHPTSAASIFFIERRPSR


5785


2669 


1388 


PRVRPRVRTDHNYYISRIYGPSDSASRDLWVNIDQMEKDKVKIH 
GI LSNTHRQAARVNLSFDFPFYGH FLRB I TVATGGF I Y TGE WH 
RMLTATQYIAPLMANFDPSVSRNSTVRYFDNGTAX,VVQ;7DHVHL 
QDNYNLGS FTFQATLLMDGRIIFGYKEIPVL.VTQISSTNHPVKV 
GLSDAFWVHRIQQI PNVRRRTI YEYHRVELQMSKI TNI SAVEM 
TPLPTCLQFNRCGFCVSSQIGFNCSWCSKLQRCSSGFDRHRQDW 
VDSG CPEES KEKMCENTE P VET \ FLEPPQ P * ERQPPSSGS*1*PP 
B/DAVTSQFPTSLPTEDDTKIALHLKDNGASTDDSAAEKKGGTL 
HAGLIVGILILVLIVATAILVTVYMYHHPTSAASIFFIERRPSR
WPAMKFRRGSGHPAYAEVEPVGEKEGFIVSEQC 


5786 


2532 


1674 


SYKLPAAERRASSCSQPPTPTRRRWPAPGRTSRGHRPQM*SGTP
APRPPARSTVSPASPLPKPRAGRCGSRFRSACSTFRPC*SLN*M
S*H*KRNLSQRSSSMSRRPLSCARPHR**RQGLTVAARLPTWAK
SPPLACSFCQAAQKSQSLSSGRSTR*PERMSFRP\SPPGNPAIP
SLAPSSRP/PKGRPQCTWIPSRWPASPTAPPTTT*APTSSPGST
GRSMMTCPTRWTATPWSARASSRPRNWPTP*WRPSGRLSTV*RA
TGGSTATAPPKRFPRNWNPMMAB 


5787


2 


1460 


MASAASVTSLADEVNCP\ICQGTLKEAGSLSNCG/HKNFCRACL

VENIERLQLVSTLGLGEEDVCQEHGEKIYFFCEDDEMQLCVVCR 
EAGEHATHTMRFLEDAA\APYREQIHKCLKCLIKEREEIQEIQS 
RENKRMQVLLTQVSTKRQQVT SEFAHLRKFLEEQQS ILLAQL3S 
QDGDI LRQRDE FDLLVAGEI CRFSALIEELEEKNERPARBLLTD 
IRSTLIRCETRKCRKPVAVSPELGQRIRDFPQQALPLQREMKMF
LEKLCFELDYEPAHISLDPQTSHPKLLLSEDKQRAQFSYKWQNS
PDNPQRFDRATCVLAHTGI TGGRHTVJVVSIDLAHGGSCTVGVVS 
EDVQR KGELRLRPEEGVWAVRLAWGFVS ALGS FP\ TRLTLKEQ P 
RQWVSLDYEVGWVTFTNAVTREPIYTFTASFTRKVIPFFGLWG 
RGSSFSLSS 


5788 


2 


6860 


EHSVSGRSSAYGDATAEGHPAGPGSVSSSTGAISTTTGHQEGDG
SEGEGEGETEGDVHTSNRLHMVRI^LERLLQTLPQLRNVGGVR 
AIPYMQVILMLTTDLDGEDEKDKGALDNLLSQLIAELGMDKKDV
SKKNERSAIJSIEVHLVVMRLLSVFMSRTKSGSKSS I CESSSLI SS 
ATAAALLSSGAVDYCLHVLKSLLEYWKSQQNDEEPVATSQLLKP 
HTTSSPPDMSPFFLRQYVKGHAADVFEAYTQLLTEMVLRLPYQI
KKITDTNSRIPPPVFDHSWFYFI^EYLMIQQTPFVRRQVRXLLL 
FICX^KEKYROIiRDIjHTIiDS\HWGIKKIJ^EECGI FLRASWTA 
SPQSALQYDTLISLMEHLKACAEIAAQRTINWQKFCIKDDSVLY
FLLQVS FLVDEGVSPVI^LLSCALCGSKVI^AIiAASSGSSSAS 
SSPAPVAASSGQATTQSKSSTKKSKKEEKEKEKDGETSGSQEDQ 
LCTALVNQI^FADKETLIQFI^CF^LESNSSSVRWQAHCLTLH 
I YRNSSKSQQELLLDLMWSI WPELPAYGRKAAQFVDLLGYFSLK 
TPQXEKKLKETYSQKAVEIIJITQNH3XTNHPNSNIYNTLSGLVEF 
DGYYLESDPCLVCNNPEVPFCYIKLSSIKVDTRYTTTQQVVKLI
GS HTI S KVTVKIGDLKRTKMVRT I NL. YYNNRTVQAXVELKN KPA 
RWHKAKKVQ1.TPGQTEVKIDLPLP I V ASNLM I E FAD FYENYQAS 
TBTLQCPRCSASVPANPGVCGNCGENVYQCHKCRS INYDEKDPF 
LCNACGFCKYARFDF?1LYAKPCCAVDPI ENEEDRKKAVSN INTL 
LDKADRVYHQLMGHRPQLENLLCKVNEAAPEKPQDDSGTAGGIS
STSASVNRYILQLAQEYCGDCKNSFDELSKIIQKVFASRKELLE
YDLQQREAATKSSRTSVQPTPTASQYRALSVLGCGHTSSTKCYG
CASAVTEHCITLLRALATNPALRHILVSQGLIRELFDYNLRRGA
AAMREEVRQLMCLLTRDNPEATQQMNDL 1 1 G KVSTALKGHWANP 
DLASSLyYEMLLLTDS ISKEDSCWELRLRCALSIiFLMAVNI KTP 
VWENITLMCLRILQKLI KP PAPTS KKNKDVP VEALTTVKPYCN 
EIHAQAQLWLKRDPKASYDAWKKCLPIRGIDGNGKAPSKSELRH 
LYLTEKYVWRWKQFLSRRGKRTSPIJ>LKIjGHNNWLRQVLFTPAT 
QAARQAACT IVEALAT I PSRKQQVLDLLTS YLDELS I AGECAAE 
YLALYQKLITSAHWKVYLAARGVLPYVGNLITKEIARLLAI^ 
TLSTDLQQG YALKSLTGLLSS FVE VES I KRHPKSRIiVGTVLNG Y 
LCLRKLWQRTKLI DETQDMLLEMLEDMTTGTESETKAPT4AVCI 
ETAKRYNLDDYRTPVFI FERX*CS I IYPEKNEVTEFFVTLEKDPQ 
QEDFLQGRMPGNPYSSNEPGIGPLMRDIKNKICQDCDLVALLED
DSGMELLVNNKIISLDLPVAEVYKKVWCTTNEGEPMRIVYRMRG 
IiIjGDATEE FI ESU3STTDEEEDEEEVYJCMAGVMAQCGGI>ECMIjN 
RLAGI RDFKQGRHLLTVIjLKIjFS YCVKVKVNRQQLVKLEMNTLN 
VMLGTLNLALVAEQESKDSGGAAVAEQVLS 1 ME I \ IQAEPNVEP 
LSEDKG1«J^TGDKDQLVMLLl>QINSTFVRSNPSVLCGLIiRIIP 

ylsfgevekmqilverfkpycnfdkydedhsgddkvfl\dcfck 
iaag i k\nnsnghql\ kdl\ i lqxg itqnald\ ymkkhi p/saa 
ri wdadi \wksfclrpalip filrllrglaiqhpgtqvligtdsi 
pnlhkleqvs\sdegigtla\f^nl\leslrehpdvnkkida\ar 

RETRAEKKRMAMAMRQKALGTLG \MTTNEKGQVVD/TRTALLEA 
DWEELI EEP\GLTCCICREG YKFQPTKVLG I YTFTKRWLGGVW 
ENKPRETSRATSTVSHFNIVHYDC\HLA\AVSLARGREEWESAA
LQNANTKCNGLLPVWGPHVPESAFATCLARHNTYLQECTGQREP
TYQLN IHDI KLLFLRFAMEQS FSADTGGGGRBSN IHLI P Y I IHT 
GLYVl*NTTRATSREEKNLOGFLEQPKEKWVESAFEVIX3PYYFT^ 
LALHI LPPEQWRATRVEILRRLiVTSQARAVAPGGATRLTDKAV 

KDYSAYRSSLLFWALVDL I YNMFKKVPTSNTEGGWSC5LAE YI R 
HNDMPIYEAADKALKTFQEEFMPVETFSEFLDVAGLLSEITDPE 

SFLKDLLNSVP 


5789


1 


2407 


LPLHAVEKTGRPGQPAXiKMPGKliRSDAGLESDTAMKKGETLRKQ j 
TEEKEKKEKPKSDKTEEIAEEEETVFPKAKQVKKXAEPSEVDMN
SPKSKKAKK\KEE PSQNDI SPKTKS LRKKKEPI EKKWSSKTKK 
VTKNEEPSEEEIDAPKPKKMKKEKEMNGETREKSPKLKNGFPHP 
EPDCNPSEAASEESNS EIBQEI PVEQKEG\AFSNFP ISEET I KL 
LKGRGVTFLFPIQAKTFHHVYSGKDLIAQARTGTGKTFSFAIPL
I EKLHG\ EI^DRiOiGRAPQVLVXAPTRELANQVSKDFSDITKKL 
S VACFYGGTPYGGQFERMRNG I DILVGTPGRI KDH I QNGKLDLT 
KLNHVVIJ3EVDQMLDMGFADQ V EEI LS VA x iuujS EDN ry I itut z> 
ATCl^fVFNVAKKYMKSTYEQV^LIGKKTQKTAITVEHIiAJKCH 

WTQRAAVIGDVIRVYSGHQGRTIIFCETKKEAQELSQNSAIKQD
AQSLHGDIPQKQREITLKGFRNGSFGVLVATNVAARGLDIPEVD
LVIQSSPPKDVESYIHRSGRTGRAGRTGVCICFYQHKEEYQLVQ
VEQKAG I KFKRIGVPSATE 1 I KASS KDAI RLLDSVP PTAI SHFK 
QSAEKLIEEKGAVEALAAALAHI SGATS VDQRSLINSNVGFVTM 
T T.nr*c T FMPMT Q Yjxwttet .KPH LGEE IDS KVKGMVFLKGKLGVCF 
DVPTASVTEIQEKWHDSRRWQLSVATEQPELEGPREGYGGFRGQ
REGSRGFRGQRDGNRRFRGQREGSRGPRGQRSGGGNKSNRSQNK
GQKRSFSKAFGQ 


5790


3786 


1585 


ARRQRDPLQAlJUUmQELKQQVDSLLSESQLKEALEPNKRQHIY 
QRCI QLKQAIDENKNALQKLSKADESAPVANYNQRKREEHTLLD 
KLTQ-QLQGLAVTISRENITEVGAPTEEEEESESEDSEDSGGEEE 

DAEEEEEEKEENESHKWSTGEEYIAVGDFTAQQVGDLTFKKGEI
LLVIEKKPDGWWIAKDAKGNEGLVPRTYLEPYSEEEEGQESSEE 

GSEEDVEAVDETADGAEVK\QRTDPHWSAVQKAISEAG I FCLVN 
HVSFCTLIVl^RNRMETVEDTNGSETGFRAWNVQSRGRIFLVSK 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
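
By way of illustration only, the one-letter legend of this table may be treated as a lookup table. The following is a minimal Python sketch, not part of the disclosed sequences; the names LEGEND and summarize_segment are hypothetical and are introduced here solely for the example. It tallies the symbols of one amino acid segment and reports any character that falls outside the legend, which may help in checking a transcribed entry.

```python
from collections import Counter

# One-letter codes exactly as listed in the table legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(segment):
    """Hypothetical helper: tally the legend symbols found in one table entry."""
    # Whitespace (line wraps, column padding) carries no sequence information.
    symbols = [ch for ch in segment if not ch.isspace()]
    counts = Counter(symbols)
    return {
        # Letters defined in the legend, including X (Unknown).
        "residues": sum(n for ch, n in counts.items()
                        if ch.isalpha() and ch in LEGEND),
        "stops": counts["*"],
        "possible_deletions": counts["/"],
        "possible_insertions": counts["\\"],
        # Anything not defined in the legend, e.g. transcription damage.
        "unrecognized": sorted(ch for ch in counts if ch not in LEGEND),
    }
```

For example, summarize_segment("MK\\A/L*X q") counts five residues, one stop codon, one possible deletion and one possible insertion, and lists "q" as unrecognized.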








PVDQQINTVDVIjTTMGAI PAG FR PSTIjS QLJjEEGNQ FRAN YF1»Q ' 

PEn^MPSQIiA>-RDLMWDATEGTIRSRPSRISLILTLWS CKMI PLP 

GMSIQVI^RHVRLCLFIX^KVLSNTHTVI^TWQPKKPKTWTFS P 

QVTRILPCLLDGDCFIRSNSASPDLGILFELGISYIRNSTGERG 

ELSCGWFLKLFE^GVPIPAKTYELFLNGGTPYEKGIEVDPSI 

SRRAHGSVFYQIhTTMRRQPQLLVKI^IiNRRSRNVI^LL 

GlWCSIHI^IFYRQIIX^VIiLKI)RMSIjQSTDiaSHPMIATFP^ 

LEQ PDVMDALRSS WAGQES \ TLKRS EKR \ P KEFTiKV PR FTxLVYH 

\GCVLPLL/HTPTRI,PPFRWAEEETETARWKVITDFLKQKQENQ 

GALQALLSPDGVHEPFDbSEQTYDFLGEMRKNAV 


5791 


3 


1636 


LRVAEFAGTSR/IGAGLIQPLHRAPARDHGL^RGGAAPAIjSVSH 
GN/GKQL/ AMSSQGSDDEQI KRENIRSLTMSGHVG FESLPDQLV 

iii\oiyyvjr vf Jixiiv> vvjo x uxoiw x uiu x xjrxv iwr x Cioonx v_.ir 

NVKLKAQTYEIjQESOTQLKIiTIVOTVGFGDQINKEESYQPIVDY 
IDAQFEAYLQEEUCI KRSLFT YHDSR IHVC1»YFIS PTGHSUCTL 
DLLTMKNLDSKVYI I P VT AKADTVS KTELQKFKI KLMS ELVSNG 
VQIYQFPTDDiyriTuXVNAAMNGQl^PFAVVGSMDEVKVGKKMVKA 
RQYP WGWQVENENHCDFVKLREML I CTNMEDLREQTHTRH YEL 
YRRCKLEEMGFTDVGPENKPVS VQET YEAKRHEFHG ERQRKEEE 
MKQMFVQRVKEKEAILKEAERELQAKFEHLKRIjHQEERMKIiEEK 
RRLLEEEI IAFSKKKATSEI FHSQSFLiATGSNLRKDKDRKNSQF 
FVKQKVPEHRRSSSQANF I KKKLEVCFD FAVT CF ITS I FGEQPQ 
LLI FMEKYFQVQGQYI SQSE 


5792 


2263 


653 


AAAAPS PAWWCpVFVVYVVHTCWVMYGI V YTRPCSGDAS C IQPY 
IJu^PKIX2lARHSFTTTRSHI^AE 

TVNV? VPTflfTR MMrrPT.YA VT P*T .HHAfiVT .PWMDT? KOVHL.V 4 ? PT iTT 

X V 11 V O V t IVA.11UI1VO X XJ J/iX X XT X^JTLTLrXO • UlrritUA3AU V flu v O IT XJ X X 

YKVPKPEEIl^LTCESDTQQIEADKKPTSALDEPVSHWRPRLiAIj 
NVMADNFVFDGS SI*PADVHR YKKM I QLGKTVHYIiP I LF I DQLSN 
RVKDLMVl^STTELPLTVS YPKVSl^RLRFW I HMQDAVYS LQQ 
PGFSEKDADEVKGIFVDTKTkYFLAIjTFFVAAF 

ISx^TKKKKSMIGMSTXAVLimCFSTWIFLFx^DEOT 
AGVGAAI EEW KVKKA L KMT I FWRG LM P E FQ FGT Y S ESERKTEEY 
DTQAMKYT^ YIxTjYPLCVGGAVYSLIiN I KYKS WYS WLINS FVNGV 
YAFGFLFMLPQLFVrmCLKSVAHLPWKAFTYKAFNTFIDDVFAF 
IITMPTSHRiyVCFRDDvVFLVYxjYQRWIjYPVDKRRVNEFGESYE 
EKATRAPHTD 


5793 


2263 


653 


AAAAPSPAWWCGVFWXVVHTCMVMYGIVYTRPCSGDASCIQPY 
LARRPKIXJL\RHSFTTTRSHLGAE1WIDLVLNVEDFDVESKFER 
TVNVSVPIOCTRNKGTIjYAYI FLHHAGVXiPWHDGKOVHLVS PLTT 
YMVPKPEEINLLTGESDTQQIFJUDKKPTSALDEPVSHWRPRLAL 
NVMADNFVFDGS S LPADVHR YMKM I QI/3KTVHYLP I LFIDQIjSN 
RVKDLMVINRSTTELPI/TVS YDKVStGRLRFW IHMQDAVYSLQQ 
FGFSEKDADEVKG I FVDTx^YFLALTFFVAAFHlxLFDFxxAFKND 
ISFWKKKKSMIGMSTKAVIjWRCFSTWI FIxFIxIiDEQTSLLVIiVP 
AGVGAAI EIiWKVKKAIxKMTI FWRGLMPEFQFGTYSES ERKTEEY 
DTQAM KYIiS YL L Y PL CVGG A VYS LLNI KYKS WYS WL INS FVNGV 
YAFGFLFMLP^LFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF 
I I1T4PTSHRIjACFRDDVVFLVYI^YQRWLYPVI>KRRVNEFOT 
EKATRAPHTD 


5794 


1 


5016 


MGPRI*SVWLLxxLPAALlxLHEEHSRAAAKGGCAGSGCGKCDCHG^ 
KGQKGERGLPGIiOGyiGFPGMOGPEX3PQGPPGQKGDTGEPGIiPG 
TKGTRGPPGASGYPGNPGLPGI PGQDGPPGPP3I PGCNGTKGER 
GPLGPPGLPGFAGNPGPPGLPGMKGDPGE I LGHVPGMLLKGERG 
FPGI PGTPGP PGLPGLQ3P VGP PGFTG P PG P PGPPGPPGEKGQM 
GLS FOG PKGDKGDQGVSGPPGVPGQAQVQEKGDFATKGBKGQKG 
EPGFQGMPGVGEKGEPGKPGPRGKPGKDGDKGEKGSPGFPGEPG 
YPGIxIGRQGPXQGEKGEAGPPGPPGIVIGTGPLGEKGERGYPGT 
PGPRGEPGPKGFPGLPGQPGPPGLPVPGQAGAPGFPGERGEKGD 
RGFPGTSLPGPSGRDGLPGPPGSPGPPGQPGYTNGIVECQPGPP 
GDQGPPGI PGQPGFIGEIGEKGQKGESCLI CD IDGYRGPPGPQG 

P PGE IGFPGQ pgakgdrglpgrdgvagvpgpogtpgl igqpgak 
GEPGEFYFDLPJLKGDKGDPGFPGQPGMPGRAGSPGPJX3HPGLPG 
PKGS PGS VGLKGERG P PGG VGFPGSRGDTG PPGP PGYG PAG P IG 
DKGQAGFPGG PGSPGLPGPKGE PGKIVPLPGPPGAEGLPGS PG F 
PGPQGDRGFPGTPGR\ PGL\ PGEKGAVG\QPGIG FPGPPGPK3V 
DGLPGDMGP PGTPGRPG FNGLPGNPGVQGQKGEPG VGIjPG1»KGL» 
PGLPG I PGTPGEKGS IGVPGVPGEHGAIGPPGLQGIRGEPGP PG 
IiPGSVGSPGVPGIGPPGARGPPGGQGPPGLSGPPG I KGEKGFPG 
FPGIiDMPGPKGDKGAQGLPGITGQSGLPGLPGQQGAPG I PGFPG 
SKGEMGVMGTPGQ PGS PGPWGAPGLPGEKGD\HGFPGSSGPRGD 
PGLKGDKGDVGLPGKPGSMDKVYMGSMKGQKGDQGEKGQIGPIG 
EKGSRGDPGTPGVPGKDGQAGQPGQPGPKGDPGISGTPGAPGLP 
GPKGSVGGMGLPGTPGEKGVPGIPGPQGSPGLPGDKGAKGEKGQ 
AGPPG IGI PGLRGEKGDQG IAGFPGS PGEKGEKGS IGIPGMPGS 
PGLKGSPGSVGYPGSPGLPGEKGDKGLPGLDGIPGVKGEAGLPG 
TPGPTG P AGQKGE PGSDG I PGS AGEKGE PGLPGRG FPGFPGAKG 
DKGS KGE VG FPG1»AGSPG I PGSKGEQG FMGPPG PQGQPGL PGS P 
GHATEGPKGDRGPQGQPGL PGLPG PMGPPGLPG I DGVKGDKGNP 
GWPGAPGVPGP RGD PGFQGM PGIGGS PG I TGS KGDMG PPGVPG F 
QGPKGLPGLQG IKGDQGDQGVPGAKGLPGPPGPPGP YDI IKGEP 
GLPGPEGPPGLKGLQGLPGPKGQQGVTGLVGIPG PPG I PGFDGA 
PGQKGEMGPAGPTGPRGFPGPPGPDGLPGSMGPPGTPSVDHGFIi 
VTRHSQT IDDPQCPSGTKI Ij YHGYS LL YVQGNERAHGQDLGTAG 
SCI^KFSTMPFLFCNINNVCNFASRNDYSYWLSTPEPMPMSMAP 
I TGENI RP FISRCAVCEAPAMVMAVHSQTIQIPPCPSGWSSLW I 
GYS FVMHTS AGAEGSG QALAS PG SCLE E FRSAP FIE CHGRGTCN 
YYANAYS FWL»AT I ERS EMF KKPTPSTLKAGELRTHVSRCQVCMR I 
RT 


5795


1192 


61 


STRSPTVEYISAHPHILFMIiLKG YEAPQ IALRCGI MLRECIRHE 
PLAKI II,FSNQFRDFFKYVELSTFDIASDAFATFKDLLTRHKVL 
VADFLEQNYDTIFEDYEKLI^SH^tfYVTK^QSL 
FAIOTKYISKPFJ^KI*MMNLLRDKSPNIQFEAFHVFKVFVASPH 
KTQPIVE 1 1»I*KNQPKL»I EFLS S FQKERTDDEQFADEKNYT*! KQI 
RDLKKTAP*RALRDSKR 


5796 



2 


1078 


GRVGWELWCMY ISPPKDWWDAGDPSLP IRTPAMIGCS FWNRKF 
FGE IGIjLDPGMDVYGGENIELGI kvwlcggsmevlpcsrvahie 
RKKKP YNSNIGF^TiQiNALRVAEVWMDD YKS HVY IAWNltPLEKP 
GIDIGDVSERRAIiRlCSLKCKNFQWYlJDHVYPFJiRRYNNTVAYGE 
LRNNKAKDVClaDQG PLEMHTA I LYPCHGWG PQLARYTKEG FLHL 
GALGTTTLLP DTRCLVDNS KSRL PQLI*D CDKVXSSLYKRWNFIQ 
NGAIMNKGTGRCLEVENRGLAG IDLILRSCTGQRWTI KNS I K*R 
EGAGJU^PGPQDMAAPPNIWTSCPGGErTARGRQVLDGPPRASPG 
QHRDPG 


5797 


2 


891 


PRVRQKTI-VDVTIiENSNI KDQI RNI^2QTYBASt4DKLREKQRQLE 
VAQVENQIJ^KMKVESSQEANAEVMREMTKiaYSQYEEKI^EEQR 
KHSAEKEJU^LEETNSFLKAIEEANKmOAAEISLEEKIX^RIGEX 
DRLIERiMEKERHQLQLQLLEHETEMSGELTDSDKERYQQLEEAS 
A\SLRERIRHLNDMVHCQQKKVKC^4VEEIESLKKKLQQKQIiLILQ 
LLEK I S FLEG ENNELQ SRLDYliTE TQ AKTE VE7RE I GV GCD LLiP 
SQTGRTRE I VMP S RN YTP YTR VLEJbTM KKT LT 


5798 


644 


115 


KILGSRWKSMSWQEKOPYYEEQARI*SKIHLEKYPNYKYKPRPKR 
TC IVDG KKLR I G3 YKQLMRSRRQEMRQ FFTVGQQPQ1P 1 TTGTG 
WYPGAITMATTTPSPQMTSDCSSTSASPEPSLPVIQSTYGMKT 
DGGS LAGNEM I NGEDEMEM YDDYEDDPXS DYSSENEAPEAVS AN 


5799 


2679 


1435 


LLSTY I KFINLFPETKATIQGVLRAGSQLRNADVELjQQRAVE YIj 
TLSSVASTDVIjATVIiEEMPPFPERESSILAKLKRKKGPGAGSAL 
DDGRRDPSSND INGGME PTPSTVSTPS PSADLLGLRAAPPPAAP 
PASAGAGNLL VD VFDG PAAQ PS LG PTPEEAFLS PG P EDIGPP I P 
EADELLNKFVCKNNGVLFENQLIiQ I GVKSEFRQN1*GRMYLFYGN 
KTSVOFQNFS PTVVHPGDLQTQLAVQTKRVAAQVX>GGAQVQQVI> 
HIECLRDFTjTPPI^SVRFRYGGAPQALTLKLPVTINKFFQPTEM 
AAQDFFQRWKQLSLPQQEAQKI FKANHPMDAEVTKAKL1X3FGSA 
LLDNVDPNPENFVGAG I IQTKAMVGCI.LRLEPNAQAQMYRI.T1j 
RTSXEPVSRHLCELLAOOF 


5800 



2679 


1435 


LLSTYI KF INLFPETKATI QGVLRAGS QLRNADVELQQRAVE YL 
TI*SSVASTDVLATVLE3MPPFPER£SS ILAKL>KRKKG PGAGSAL 
DDGRRDPSSNDINGGMEPTPSTVSTPS PSADLLGLRAAPPPAAP 
PASAGAGNLLVDVFDGPAAQPSLGPTPEEAFI^PGPEDIGPPIP 
EAD^7J,NKFVO<NNGVLFENQLl^IGVKSEFRQNI^RMYL.FYGN 
KTS VQFQNFS P TVVHPGDUnTQIAVQTKR VAAQVMG AQVQQVL 
NI ECLRDFLTPPLLSVRFRYGGAPQALTLKLPVTINKFFQPTEM 
AAQDFFQRWKQLSLPQQEAQKI FKANHPMIIAF/TTKAKLLGF'GSA 
LLDNVDPNPENFVGAG I IQTKALQVGCLLRLEPNAQAQMYRLTL 
RTSKEPVSRHLCELLAQQF 


5801 


3 


1413 


FPRLYHLI PDGEITS IKINR VDPSESLS I RL»VGG£>Iil PJbVHl X 1 
QHI YRDGVI ARDGRLLPGDI ILKVNGMD I SNVPHNYAVRLLRQP 
CQ VLWLTVMREQKFRS RNNGQAPDAYR PRDDSFHVILNKSSPEE 
yjjCJ 1 KXjV RK.VL/c»Pvj V r X r N VJjIJvjnjVAx KnuUL^WUKVii/Uftlun 
DLRYGS P ESAAHLI QAS ERRVHLWSRQVRQRS PDI FQEAGWNS 
NGSWS PGPGERSNT P KPLKPT I TCHE KWN IQ KDPGB S LGMTVA 
GGAS HREWDLP IYVISVE PGGVI SRDGRI KTGDI LLNVDGVELT 
BVSRSEAVALLKRTSSS I VX.KALEVKEYEPQED CSS PAALDSNH 
NMAPPSDWSPSWVMWLELPRCLYNCTKDIVLRRNTAGSLGFCIVG 
G YEE YNGNKPFFI KS I VEGTPAYNDGR I RCGD ILLAVNGRSTSG 
MIHACLARLLKELXGRI TLT I VS WPGTFL 


5802 


3 


290 


CFSLxQIMERlMDLPTLLRilAt'KEMFSVGGLFWMr« 
GAFFYLISPLDFVPEALFGILGFLDDFFVIFLLLIYISIMYREV 

ITQRLTR 


5803 


2234 


1299 


EAQFGTTAEIYAYREEQDFGIEIVKVKAIGRQRFKVLELRTQSD 
GIQQAKVQILP2CVLPSTMSAVQLESLNKCQIFPSKPVSREDQC 
SYKWWQKYQKRKFHCAE^TSWPRWLYSLYDAETLMDRIKKQLRE 
WDENLKDDSLPSNPIDFSYRVAACLPIDDVLRIQLLK1GSAIQR 
LRCELDIMNKCTSLCCKQCQETEITTKNEIFSLSLCGPMAAYVN 
PHG YVHETLTVYKACNLNLIGRPSTEHSW FPG YAWTVAQ CKI CA 
SHIGWKFTATKKDKS PQKFWGLTRSALLPTI PDTEDE IS PDKVI 
LCL


5804 




1707 


EMEKQRQEEQRKRTEEERKRRIEQDMLEKIUUQREIjAKRAEQIE 
DINNTGraSASEEGDDSLLITWPVKSYKTSGKMKKNFEDLEKE 

Kohl K K hd 1 K Y £j hi L> K.K JLJKx £»r»y rtfi LtT^JiiHJS. VIjo Jj V SrUJUd ico i_.Airvxv 
ESLS PGKLKLTFEELERQRQENRKKQAEEEARKRLEEEKRAFEE 

arrqmvnedeenqdtaki fkgyrpgklkls feemerqrredekr 
kaeeearrrieeekjcafaearrijmvvdddspemyktisqefltp 
gkleinfeellkqkmeeekrrteeerkhklemekqefeqlrqem 
geeeeenetfglsre yeeli klkrsgs iqaknlkskfekigqls 

pinp Tft'PCKTEEE'RJVR'RI? A TDLE I KEREAENFHEEDD VDVRPAR KS 

eapfthkvnmkarfeqmaxareeeeqrrieeqkxlrmqfeqrei 
daalqkkreeeeeeegs imngstaedeeqtrsgapwfkkplknt 

SWDS E PVRFTVKVTGE P KP E ITWWFEGE I LODGED YQ Y I ERGE 
TYCLYLPETFPEDGGEYMCKAVNNKGSAASTCILTIESKN 


5805 


3 


776 


Y IS DTLGQVYKS KIRWV7 1 EENGGNGNIS VDDL IALLDLAEHASS 
AFKESQQQSBDRE YE VKERLYPKS KRRYDTYNIAG YQGE I EVGL 
YTIQILQLIPFFDNKNELSKRYMVNFVSGSSDIPGDPNNEYKLA 
LKNYI P YLTKLKFSLKKSFDFFDEYFVLLKPRNNI KQNEEAKTR 
RKVAGYFKKYVDIFCLLEESQNNTGIXSSKFSBPLQVEROIRIILV 

ALKADKFSGLLE YLI KSQEDAI STMKC I VNEYTFLLK 


5806 


1257 


877 


AVFTFHNHGRTANLYSLHSWLGITTVFLFACQRFLGFAVFIxLPW 
ASMWLRS LLKP 1 HVFFGAAI LSLS IAS VI SGINEKLFFSLKNTT 
RPYHSLPSEAVFANSTGMLWAFGLLVLYILLAS SWKRP 


5807 


22S7


1302


RFS KKT FRRPMAVDI QP ACLGLYCGKT LLFKNGSTE I YGE CGVC 
PRGQRTNAQKYCQPCTES PEL YDWL YLG FMAMLPLVLHWF F I EW 
YSGKKSSSALFQHITALFECSMAAI ITLLVSDPVGVLY IRS CRV 
LMLSDWYTML YNPS PDYVTTVHCTHKAVYPLYTI VF I YYAF CLV 
LMMLLRPI*I*VKKIACGLGKSDRFKS IYAALYFPP I LTVLQAVGG 
GLLYYAFPYI ILVLS LVTLAVYMSASB I ENCYDIoLVR KKRI*I VI» 
F S HWLLHA YG 1 1 S I S R VDKLE QDLP IaLAL VPTP AL FY L FTAKFT 
EPS RXLS EGANGR 


5808 


2 


433 


SLPDSGVVEYIjSNGGVADNHKDFGELRYNECX*MNFS cngkngs s 
EGR ITHGFQIjKSAYENNLMPYTNYTFDF KGV IDY I FYSKTHMNV 
LGVLGPLDPQWLVENNITGCPHPHIPSDHFSLLTQLELHPPIiLP 
LVNGVHLPNRR 


5809 


464 


2422 


ILVPGFQG I LHPGVY CALQSQHQAQELVADIDECEVSGLCRHGG 
RCVNTHGSFECYCMIX3YIJRNGPEPFHPTTDATSCTEIDCGTPP 
EVPDGYI IGNYTSSIX5SQVRYACREGFFSVPEDTVSSCTGLGTW 
ES PKLHCQE INCGNP PEMRHAI LVGNHS SRLGGVARYVCQEGFE 
S PGG KI TS VCTEKGTWRESTI»TCTE ILTKINDVSLFNDTCVRWQ 
INSRRINPKI S YVI S IKGQRLDPMESVREETVNLTTDSRTPEVC 
LALYPGTNYTVNISTAPPRRSMPAVIGFQTAEVDLLEDDGSFNI 
S I FNET CIj KliNRRS RKVG S EKMYQ F WLGQR WYLAN FS HATS FN 
FTTREQVPWCLDLYPTTDYTVNVTLLRSPKRHSVQI T IATP PA 
VKQTISNI SGFNETCLR WRS I KTADMEEMYLFH I WGQRWYQKE F 
AQEMTFNISSSSRDPF^CIJDIjRPGTNYKVSL.RAIjSSELPVVISL 
TTQITEPPLPEVEFFTVHRGPLPRIjRIjRKAKEKNGP I SS YQVIAT 
LPLALQSTFS CDSEGASSFFSNASDADGYVAAELLAKDVPDDAM 

EI PIGDRI, yyge yynaplkrgsdyci xlritsewnkvrrhscav 

WAQVKDSS I*MI#LQMAGVGLGSLAWI XLTFLS FSAV 


5810 


3 



1641 


KVFGTHKDHE VSTLDT AI S AVKVQLAEFLENLQE KSliR I EAFVS 
E IESFFNT I EENCSKNEKRLEEQNEEMMKKVliAQYDEKAQSFEE 
VKKKKME FLHEQMVHFLQSMDTAKDTLETIVREAEELDEAVFLT 
S FEEINERIOiSAMESTASLEKMPAAFSIjFEHYDDSSARSDQMLK 
QVAVPQPPRLE PQEPNSATSTTIAVYWSMNKEDVTDS FQVYCME 
EPQDDQEVNELVEEYRI/TVKES yci FFJDLEPDRCYQVWVMAVNF 
TGCSLPSERAI FRTAPSTPVIRAEDCTVCWNTAT IRWRPTTPEA 

tetytleycrqhspegeglrsfsgi kglqlkvnlqpndnyffyv 
rainafgtseqseaalistrgtrflllretahpalhisssgtvi 
s fgerrri/te i ps vlgeelps cgqhywettvtdcpayrlgi css 
savqagalgqgetswymhcseporytffysgivsdvhvterpar 

VG IIjLDYNNQRIj I FINAESEQLLF 1 1 RHRFNEG VHPAFALEKPG 
KCTLHLG1BPPDSVRHK 


5811 


1918 


851 


AAALADPLPEDKWSAEKRRPLKSSLGYEITFSLLNPDPKSHDVY 
WDIEGAVRRYVQPFI^AIXIAAGNFSVDSQILYYAMLGVNPRFDS 
AS SS YYLDMHSliPHVINPVESRLGS S AASL YPVI*N FliiYVPELA 
HSPLYI QDKDGAPVATNAF1ISPRWGGIMVYNVDS KTYNASVLPV 
RVEVDMVRVMEVFIiAQLRLLFG I AQ PQLPPKCI*I*SGPTS EGLMT 
WELDRLLWARS VENLATATTTLTS LAQL LG KI SN I VI KDDVAS E 
VYKAVAAVQKSAEELASGHLASAFVASQEAVTSSEIiAFFDPSLi 
HLLYFPDDQKFAI YIPLFLPMAVPI LLSLVKI FLETRKSWRKPE 
KTD 


5812 


5204 


2744 


GGRQRCQRGRSCGAREBEVEPGTARPPPAASAMDASIjEKIADPT 
IiAEMGIWI^EAVKMIiEDSQRRTEEENGKKLISGDI PGPLQGSGQ 
DMVS I LQIiVQNIjMHGDEDEE PQS PRI QNIGECGHMAIiIiGHS LGA 
YISTLDKEKXRKLTTRIIiSDTTLWLCRI FRYENGCAYFHEEERE 
GIiAKICElAIHSRYEDFVVDGFNVLYWKJCPVIYLSAAA^ 
YLCNQI^IjPFPCIjCRVPCNTVFGSQHQMDVAFLEKIjIKIJDI ERG 
RLPIiLVANAGTAAVGHTDKIGRLKELCEQYGI WLHVEGVNIAT 
LALGYVSSSVI^AKCDSMTMTPGPWI^LPAVPAVTLYKHDDPA 
I*TLVAGLTSNKPTDKLRAI^LWLSLQYI*GLDGFVERI khacqls 
QRLQESIiKKVNY I KI LVEDELS S PVWFRFFQEL PGSDPVFKAV 
PVPhWPSGVGRERHSCDALNRWlX3EQLKQLVPASGLTVMDLEA 
EGTCIiRFS PLMT AAVLGTRGED VDQ L VA C I ES KLP VLCCTLQLR 
EEFKQEVEATAGLLYAmDPNWSGIGVWYEHANDDKSSLKSYPQ 
GENIHAGIJ^KKLNEI^SDI*TFKIGPEYKSMKSCI*YVGMASDNVH 
AAELVETIAATAREI EDNSRLLENMTE WRKGIQEAQVELQKAS 
EERLLEEGVliRQI PWGSVLNWFSPVQALQKGRTFNX»TAGSI,ES 
TEP I YVYKAQGAGVTbP PTPSGSRTKQRLPGQKPFKRSLRGS DA 
LSETSS VS H I EDLEKVERLS SG P EQ I TL. EASSTEGH PG APS PQH 
TDQTEAFQKGVPHPEDDHSQVEGPESLR 


5813 


2936 


699 


HRDGVSGSLERPUTDRS RTGAFAQQRGKMATAGGGSGAD PGSRG 
LWU^FCVT^IiAGLCRGNSVERKrYIPnNKTA 
GCQSSISGDTGVIHVVEKEEDI.QWVIiTDGPNPPYMVIJiESKH^ 
RDLMEKLKGRTSRIAGIiAVSLTKPS PASGFS PS VQCPNDGFGVY 
SNSYGPEPAHCREIQWNSLGNGIAYEDFSFPIFIJLEDENETKVI 
KQCYQDHNI*SQNGSAPTFPLCAMQLFSHMftWLS FSTAT \ CMRRS 
S IQS TFS INPKI VCDPLSDYNVWSMLKP INTTGTLKPDDRWVA 
ATRJjDSRS F FWNV \APGAaS AVAS FVTQIiAAAEAXiQKAPD V JL 1 X> 
PRNVMFVFFQGETFD YIGSSRMV YDMEKGKFPVQIjENVDS FVEIi 
GQVALRTSIjEIjWMHTD P VS QKNES VRNQVEDLIiATIjE KSGAGVP 
AVZliRRPNQSQPLPPSSLQRFLRARNISGVVLADHSGAFHNKYY 
QSI YDTAENINVS YPEWLEPLKE /ETWNFG* QDTAKALADVATV 
IX5RAIiYEIAGGTNFSDTVQADPQTVTRLLYG\FLIKANNSWFQS 
II^RDIJISYI^*RGI,FX3H\YIAV\SSPTOTIYV/VI J QYALANI* 
TGTWNLTREQCQDPS KVPSENKDL YEYSWVQG PLHSNETDRLP 
RCVRSTARLARALSPAFELS QW S S TE YSTWTES RWKDI RAR I Fl» 
IAS KELELI TLTVGFG I IjI FSL I VTYCINAKADVLFIAPREPGA 
VSY 


5814 


8500 



432 


ALKCRPRRVLAI LVGP VQPDRMAE EG AVAVCVRVRPLNSREES L 
GETAQVYWKTHNNVI YPVDGS KSFNFDRVLHGNETPKNVYEA\I 
AAPI IDSAIQGYNGTIFA\YGQT\ASGKTYTMMGSEDH1»GVIPQ 
GQFHGHFSQKI + EVFLDRE FLLRVS YME I YNBT ITDLLCGTQKM 
KPLI IREDVNRNVYVADLTEEWYTSEMALKWITKGEKSRHYGE 
TKMNQRSSRSHT I PRMI IiESREKGEPSNCEGSVKVSHIaNIiVDlA 
GSERAAQTGAAGVRDKEGCNINRSLFIIiGQVIKKIiSDGQVGGFI 
NYRDS KLTR I LQNS LGGNPKTRI I CTITPVS FDE TL1TAX1 QFAST 
AKYMKNTPYVNEVSTDEAI^KRYRKE I MDL.KKQLE EVS LETRAQ 
AMEKDQIiAQLLE E KDI*LQKVQNEK I ENLTRML VTSSSLTIiQQEI* 
KAKRKRRVTWCLGKINKMKNSNYADQFNI PTNITTKTHKLS INI* 
LRE I DESVCSESDVFSNTI*DTI*S EI EWN PATKLI»NQEN IBS ELN 
SLRADYDNIjVLDYEQLRTEKEEMEIjKIjKEKNDLiDEF^ 
KDQEMQL I HE I SNLKNLVKHRE VYNQDL.ENELS S KVELLRE KED 
QIKKLQEYIDSQKLEfrlKMDLS YSLES I EDPKQMKQTLFDAETV 
AIjDAKRESAFLRS ENbELKEKMKELATT YKQMEND I QLYQS QIjE 
AKKKMQVDLEKELQSAFNE I TKLiTS L I DGKVPKDLLCNLELEG K 
I TDIQKELNKEVEENEALRSE VIt»LSEIiKSI*PSEVERLRKE IQD 
KSEELHI ITSEKDKIiFSEWHKESRVQGLLEEIGKTKDDLATTQ 
SNYKSTDQEFQNFKTLHMDFEQKYKMVI .EENERMNQE I vnls ke 
AQKFDS SLGALKTELS YKTQELQEKTREVQERIjNEMEQI>KBQLE 
NRDS PLQTVERE KTL» I TE KI.QQT L EEVKTIiTQEKDDLKQLQE S L> 
QIERDQLKSDIHDTVNMNIDTQEQLRNAIjESLKQHQETINTLKS 
KI SEE VSRNLHMEENTGETKDEFQQKMVG I DKKQDLEAKNTQTI> 
TADVKDWEI IEQQRKI FSLIQEKNEXQQMLESVIAEKEQLKTDL 
KENIEMTI ENQEELRLIX3DELKKQQEIVAQEKNHAI KKEGELSR 
TCDKLAEVEEKLKEKS QQLQEKQQQLLNVQEEMSEMQKKINE I E 
NLKNELKMKELTLEHMETERLEI^QKI^ENYEEVKSITKERKVL 
KELQKS FETERDHLRG YI RE I EATGLQTKEELKIAH I HLKEHQE 
TTnPIjRRSVS EKTAOI INTODIjEKS HTKLOEE I PVLHEEOELIiP 
NVKKVSETQETMNEI^IjIjT^QSTTKDSTTLARIEMERLR]^ 
QE5QEE I KSLTKERDWLKTI KEALE VKHDQLKEH IRETLAKIQE 
SQSKQEQSLNMKEKDNETTKI VSEMEQFKPKDSALLR IE I EMLG 
LSKRLQESHDEMKSVAKEKDDLQRLQEVLQSESDQLKENI KE I V 
AKHLETEEELKVAHCCLKEQEETINEIjRVNIjSEKETEISTIQKQ 
LEAINDKLQNKI QE I YEKEEQIjN 1 KQ IS EVQEICVNELKQ FREER 
KAKDSALQS lESKMLELTNRIiQESQEEIQIMlKEKEEhlKRVQEA 
I^IERDQLKEOTKEIVAKMKESQEKEYQPIjKMTAVNETQEKMCE 
lEHLKEQFETQKLNLE^IETENIRLTQII^HEin^EEMRSVTKERD 
DIiRSVEETI*KVERIX5I»KENLRETITRDLEKQEEIJCIVl^^ 
QETIDKLRGIVSEKTNEISNMQKDLEHSNDAI*KAQDLKIQEELR 
IAHMHLKEQQET I DKliRG IVS E KTDKLSNMQKDLENSNAKLQE K 
IQELKANEHQLITLKIOVNETQKKVSEMEQLKKQ I KDQSLTLS K 
LE I ENLNIAQKI^ENl^EMKS VMKERDNLRRVEETLKLERDQLK 
ESI^ETKARDI£IQ#ELKTARMI*SKEKKETVDKL^ 
ISDIQKDIjDKSKDELQKKIQEIiQKKEIjQIJJIVKEDVIWSHKKIN 
EMEQI^QFEPNYLCKCEMDNFQLTKKIjHESLEEIRIVAKERDE 
LRRIKESLKMERIXJFIATIjREMIARDRQNHQVKPEKRLLSDGQQ 
HLMESLREKCSRIKELI^YSEMDDHYECIJ^RLSLDLEKEIEFH 
RIMKKLKYVL^YVTKIKEEQHECINKPEMDFIDEVEKQKELLIK
IQHLQQD CDVPSRE LRDLKLNQNMDIiH I EE I LKDFS ES E FPS I K 
TEFQQVIiSNRKEMTQFLEEWLNTRFlJIEKLKNGIQKENDRICQV' 
^FFNNRIIAIMNESTEFEERSATISKEWEQDLKSIsKEKNEKIiF 
KNYQTLKTSI^GAQVNPTTQDNKNPHVTSRATQLTTEKlREIiE 
NSLHEAKESAMHKESKI I KMQKELEVTND 1 1 AKLQAKVHESNKC 
LEKTKETI QVTiQDKVALGAKPYKEE I EDIiKMKLGKIDLEKMKNA 
KEFEKEI SATKATVEyQKEVIRLLRENLRRSQQAQDTSVISEHT 
D PQP SNKPLTCGGGSG I VQNTKALI LKSEH I RLEKE I S KLKQQN 
EQLI KQKNEI^NNQHLSNEVKTWKERTLKREAHKQVTCENSPK 
SPKVTGTASKKKQITPSQCKERNLQDPVPKESPKSCFFDSRSKS 
LPSPHPVRYFDNSSLGLCPEVQNAGAESVDSQP\GPWARLFQGK 
DVP\ECKTQ 


5815


23 


1460 


S ELVMWTVQNRES LGIjLS FP VW I TMV CCAHS TN E PSN MS Y V KET 
VDRLLKG YD I RLRPD FGG P PVDVGMR I DVAS I DMVSEVNMD YTI> 
TMYFQQSWKDKR1»SYSGI PLNLTLDNRVADQLWVPDTYFLNDKK 
SFVHGVTVKNRI^IP^HPDGTVLYGLRITTTAAO^MDLRRYPLDE 
CNC7LEIES YGYTTDDI EF YWNGGEGAVTGVNKI ELPQFS I VDY 
KMVSKKVEFTTGAYPRI^LSFRLKRNIGYFILQTYMPSTLITII* 
SWVS FWINYDASAARVAiGITTVLTMTTISTHliRETLPKI P YVK 
AIDIYliMGCFVFVFIiAIiliE YAFVNY I F FGKG PQKKGAS KQDQSA 
NEKNKLEMNKVQVDAHGNII*LSTLEIRNETSGSEVIiTSVSDPKA 
TMYSYDSAS I QYRKPLS S RE \A* GRAP3RHG VPS KGR I RRRAS \ 
QLKVKIPDLTOVNSIDKWSRMFFPITFSLFNVVYWLYYVH 


5816 


861 


191 


TSSRSRAAAQEGDAETPGS VERRGRRAGAEDGMSQAPGAQPS PP 

TVYHERQRLELCAVHALN]fVLQQQI*FSQEAA^ 

NPHRS LLGTGNYD VNVIMAALQGLGLAAVWWDRRR PLSQLALPQ 

VLGLILNLPS PVSLGI^LPLRRRHLRWPCARL/VTVSYYNLDS 

K\LRAPEGPGGI»RTE\ *G PFlAAAIiAGGIiCEVIiLVVTKEVE EKG 

SWLRTD 


5817 


851 


118 


RLFRGPGANRGRSCRGCSGGREPSGGALPKRHCPC*PPSPPAAD 
VMSNTTVPNAPQANSDSMVGYVIXSPFFLITLVGV^ 
KKRVDRLRHHLIjPMYS YDPAEELHEAEQE LI*SDMGDPKVV\ QAG 
RVATSTSG CHCWMS RRDLTPLPHPS EPGVLDCIiG P CHLLPLLS P 
GSPCWVLGLHFSLHPPSAASASHALTITSLPPGIiLPFVGVELTA 
HPCALMGRGF P S GMAAAGRHLCFt* 


5818 


3 



3918 


QALRDKLW IFI»VQS F YA VRHTES WKLMSTDDQQKI QAAAFDKG D 
DRRLGKKP I FSS SQQRKQVSDSGDIKI KS WRGNNKKECWSYLST 
NKKMKS DGLGASGHS SSTNRNS INKTLKQDD VKEKDGTKI AS K I 
TXELKTGGKNVSGKPKTVTKS KTENGDKARLENMS PRQWERSA 
TAAAAATG Q KNL LNG KG VRNQEGQ I SGARP KVLTGN LNVQ AKAX 
DT.v-iraTf:imQ T>rT,<z T AGP^RSTDSSME FS I STECLDEPKENGS 
TEEEKPSGHKLS FCDS PGQMMKNSVDSVKNSTVAI KSRPVSR VT 
NGTSNKKS I HEQDTNVNNSVLKKVSGKGCS EP VPQAI LKKRGTS 
NGCTAAQQRTKSTP SNLTKTQGS QGES PNS VKSSVSSRQS DENV 
AKLDHNTTTEKQAPKRKMVKQVHTAI^KVNAKIVAMPK^ 
KGETLNNKDSKQKMPPGQVISKTQPSSQRPLKHETSTVQKSMFH 
DVRDNNNKDSVSEQKPHKPLINliASEISDAEAIiQSSCRP\DPQK 

PLNDQEKEKLALECQNISKI^KSLKHELESKQICL^ 
HKETDDCDAANXCCHSVGSDNVNSKFYSTTAI^KYMVSNPNENSI* 
NSNPVCDLDS TSAGQIHLISDRENQVGRKDTNXQSS I KCVEDVS 
LCNPERTNGTIJMSAQEDKKSKVPVEGLTI PS KX»SDESAMDEDKH 
ATAJDSDVS S KCFSGQLSEKKS PKNMETSES PESHETPETP FVGH 
WNI*STGV3oHQRESPESDTGSATTSSDDI KPRSEDYDAGGSQDDD 
GSNDI^ISKCGTMLCHDFIXsRSSSDTS^PEEIiKIYDSNliRIBVK 
MKKQSSNDLFQVNSTSDDEIPRKRPBIWSRSAIVHSRERENIPR 
GS VQFAQE I DQVSSSADETBDERS EAENVAENFS I SNP APQQFQ 
GI INI^EI1ATE3^CREFSANKKFKRSVLLSVDECEELGSDEGE 
VHTPFQASVDSFSPSDVFDGISHEHHGRTCYSRFSRESEDNILE 
CKQNKGNSVCKNESTVLDLSS I D S S RKNKQS VSATE KKNT I D VL 
SSRSRQLI>REDKKVNNGSNVF^roiQQRSKFLDSDVKSQER^ 
DLHQREPNS DI PKNS ST KSLDS FRS QVLPQEGP VKESHS TTTEK 
ANIAI^AGDIDDCDTLAQTRMY DHRP SKTLS PI YEMDVI EAFEQ 
KVESETHVTEMDF* DDQHFAKQDWTI»IjKQLI*SEQDSNLDVTN S V 
P EDLSLAQ YL XNQTIjLIiARDS S KPQG 1TH I DTLNRWS EliTSPLD 
SSAS I TMAS FSS EDCS PQGEWT I IiELETQH 


5819 


1 


5557 


AAAGLLGAUU.VMTLVVAAARAEKEAFVQSES 1 1 EVLRFDDGGL 
LQTETTLGLSS YQQKS I SLYRGNCRPI RFEPPMLDFHEQPVGMP 
KMEKVYLiHNPSS E * T I TLVS I FATTSHFHAS FFQNRKI 1» PGGNT 
S FDVS / VFIiARVVGNVENTLF INTSNHGVFTY\QVFGVGVPNP Y 
RI^PFIjGiUlVTVNSSFSPIINIHlIPHSEPLQVVEMYSSGGDIsHI, 
ELPTGQQGGTRKLWEI PP YETKGVMRASFSSREADNHTAFIRIK 
TNASDSTEFI I L P VEVEVTTAPG I YS STEMLDFGTLRTQDL PKV 
LNLKO,I^SGTKDVPITSVRPTPQ\OT3AITVHFXPITLKAS\ESK 
YTKVASISFDASKAKKPSQFSGKITVKAKEKSYSKLjEIPYQAEV 
LDGYLGFDHAATLFHIRDSPADPVERPIYLTNTFSFAILIHDVIj 
LPEEAKTMFKVHNFSKPVLI I>PNESG Y IFTLLFMPSTSSMHIDN 
NI 1*1*1 TNASKFHLPVRVYTGFTJ>YFVLPPKI EERF IDFGVLSAT 
EASNILFAI INSNPI EIAI KSWHI IGDG\LS 1 ELVAVDRGNRTT 
1 1 SSL PECEKSS S SDQS S VTLjASG YF \AVFRVKLTAK KI* \ EG IH 
DGAIQITTDYEILTI PVK\AVIAVGSliTCSPKHVVLPPS FPGKI 
VHQSI^IMNSFSQKVKIQQIRSLSEDVRFYYKRJbRGNKEDLEPG 
KKSKIANIYFDPGLQCGDHCYVGLPFXSKSEPKVQPGVAMQEDM 
WDADWDIiHQSLFKG WTG I KENSGHRLSA I FEVNTDLQKN I 1 SKI 
TABLSWPS ILSSPRHLKFPLTNTNCSS \EEEITLENP/SQDVPV 
YVQFI PLALYSNP SVFVDKLVS RFNLS KVAKIDLRTLiEFQVFRN 
SAHPLQSSTGFMEG\l>SPmiI*lNrLILKPGEKKSVKVK\FTPV^ 
RTVSSLI I VTONLTVMDAVMVQGQGTTENLRVAG KL PGPGS S LR 
FKITEALIiKTKrTDSLKLREPNFTIiKRTFKVFJOTGQLQIHIETIE 
ISG YS CEG YGFKVVNCQEFTLS ANASRD I 1 IIiFT PDFTAS RVI R 
ELKF I TTSGSEFVF I LNASLPYHMIATCAEALPRPIWEI1AI1Y 1 1 
ISGIMSALFI*I*VIGTA\YI»EAC^IWBP\FRRRLS\FEASNPPFD 
VGRPFDI*RRI VG I S S EGNI*NTLS CDPGHSRGFCGAGGS 5S RPSA 
GSHKQ* GP SGHPHS SHSISR^S ADVDDVRAYNSGRTS SMTSAQAA 
SSQPANKTRPLVLDSNTGAQGHSAGRKSKGAKQSQHGSQHHAHS 
PLEQHPQPPLPPPVPQPQEPQPERIjSPAPLAHPSHPERASSARH 
SSEDSDITSLiIEAMDKDFDHHDSPALEVFTEQPPSPLPKSKGKG 
KPLCRKVXPPKKQEEKEKKGKGKPQEDELKDSLADDDSSSTTTE 
TSNPDTEPLLKEDTEKQKGKQA14PEKKESEMSQVKQKSKKLLNI 
KKEIPTDVKPSSLELPYT^PIjESKQRRNljPSKIPIiPTAMTSGSK . 
SRNAQKTKGTSKLVDNRPPAtiAKFI*PKSQELGNTSSSEGEKDSP 
PPEM)S VPVHKPGSS TDSIj YKLSIjG/ri*NAD I FLKQRQTS PTPAS 
PSPPAAPCPFVARGSYSSIVNSSSSSDPKIKQPNGSKHKLTKAA 
S L PGKIsTGN P T FAAVTAG YD KS PGGNGFAKVSSNKTGFSSSLGI S 
HAP VDSDGSDSSGLWS PVSNPSS PDFTPLNSFSAFGNSFNLTGE 
VFSKLGLSRSCNQASQRSWNEFNSGPSYLWESPATDPSPSWPAS 
SGSPTlITATSVIX31^SGLWSTTPFSSSIWSSm*SSALPKrTPAN 
TliAS IGLMGTENS PAPHAPSTSS PADDLGQTYNP WR I WS PTIGR 
RSSDPWSNSHFPHEN 


5820 


310 


1270 


R VS LSGPVS I^VLliCARSSTMGKRDNRVAYMNP I AMARSRGP I Q 
SSGPTIQ\ VI * IDQGLPGKK* KSN*KRKRK/DSKALAEFEEKMN 
ElWKKEIiEKHREKLLSGSESSSKKRQRKKKEKKKSW* \DSSSS\ 
S SS SDSSS S SS DSEDEDKKQGKRR KKKKKRSHKSSESSMS EXES 
DSKDSLKKKKKSKDGTEKE KDIKGLSKKRKMYSEDKPLSS ESLS 
ES E YIEEVRAKKKKSS E EREKATEKTKKKKKHKKHSKKKKKKAA 
SSS PDSP*E* EKSGFPYKESAMSEE ISTVKTTTYIiLKCMNFLVF 
GI I PGliFSSHSDATV 


5821


179 


915 


KWRNQS WRW PKPGTN W M1>S CS VCWRRVTWTGS VWMRKbG KHPQT 
PT/I KDCS IAATGKRPSARFPHQRRKKRREMDDGLAEGG PQRSN 
TYVI KLFDRSVDIAQFSENTPLYP ICRAWMR^IS PSVRERECSPS 
SPI*PPLPEDEEG\ SEVTNSKSR* CVQACPPTHTPGGQPKNACR\ 
SRIPSPLAALRMQGTP * RWS PFEPEPS PSTLI YRNMQRWKRI RQ 
RWKEASHRNQLRYSESMKI LREMYERQ 


5822


464 


4379 


QTLKEM PI VMARDLEETASSSEDEEV I SQEDHPCI MWTGGCRRI 
PVLVFHADAI t*TKDNNIRVrGERYHLSYKIVRTDSRLVRS ILTA 
HGFHEVHPSSTDYNL^TGSHLKPPIJ^TI^EAQKVNHFPRSYE 
LTRKDRLYKNI I RMQHTHG FKAFH ILPQTFLLPAE YAEFCNS YS 
KDRGPWIVKPYASSRGRG\VY1.INNPNQISLEENILVSRYINNP 
IiLIDDFKFDVRIjYVLVTSYDPIjVIYIiYEEGJjARFATVRYDQGAK 
NIRNQFMHIiTNYSVNKKSGDYVSCDDPEVEDYGNKWSMSAMLRY 
LKQBGRDTTALMAHVEDL 1 1 KTI ISAEIAIATACKT FVPHRS S C 
FELYGFDVLIDSTIiKPWI*I»EVNLSPSIiACDAPLDIiKIKASMISD 
MFTVVGFVCQDPAQRASTRPIYPTFESSRRNPFQKPQRCRPLSA 
S DABM KNLVG SARE KGPGKLGGS VLGLSMEE I KV1.RRVKE ENDR 
RGGFIRI FPTSETWEIYGSYLEHKTSMNYMLATRLFQDRMTADG 
APEXtKI * S I*NS KAKLHAAIiYERKLLS LEVRKRRRRS S RLRAMRP 
KYPVITQPAEMNVXTETBSEEEEEVALDNEDEEQEASQEESAGF 
LRENQ AKY TPS LTAL.VENTP KENSMKVREWNNKGGHCCKL E TQE 
LEPKFNIjMQILQDNGNLSKMQARIAFSAYLQHVQI\RIjMKDSGG 
QTFS AS WAAKEDEQMELVVRFLKRASNNIjQHSLRM PSRPJLAI* 
LERTRILiAHQLGDFIIVYNK^TEQMAEKKSKKKVEEEEEDGVNM 
ENFQEFIRQASEAELEEVLTFYTQKNKSASVFLGTHSKISKIWN 
NYSDSGAKGDHPET IMEEVKI KPPKQQQTTEIHSDKLSRFTTSA 
EKEAKLVYSNSS S G PTATLQKI PNTHIiSSVTTSDIjSPGPCHHSS 
LSQIPS AI PSMPHQPTI LLNTVSASAS PCbHPGAQNI PS PTGLP 
RCRSGSHTIGPFSSFQSAAHIYSQKLSRPSSAKAGSCYIiNKHHS 
G IAKTQKEGEDASIiYS KRYNQSMVTAELQRLAEKQAARQ YS PSS 
HINLLTQQVTNLNLATGI INRSSASAPPTLRPI IS PSGPTWSTQ 
SDPQAPENHSSS PGSRSLQTGGFAWEGEVENNVYSQATGVVPQH 
KYHPTAGSYQLQFAX.QQLEQQKLQSRQLLDQSRARHQAI FGSQT 
LPNSNLWTMNNGAGCR I SS ATASGQKPTTLPQKWP P PSSCAS 1» 
VPKPP PNHEQVLRRATSQKAS KGSSAEGQLNGLQS SliNPAAFVP 
I TS STD PAHTKI MNHKHTEKQPVHHSWVHD 


5823 


42 


2293 


LLTAIiSMEGGGGRDEPSACRAGDVNMDDPKKEDI LLIADEKFDF 
DLSLSSSSANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP
TSESPFAWSPLAGEK^^ETVYKEAHULALHIESSSRNQAAQAAKP 
EDPRSQGVERFI QES KF\ KI NLFE KEKEM KKS PTS LKRETYYLS 
DS PLLGPPVGEPRIOiAS SPA1*PS SGAQARLTRAPG P PHS AHAI>P 
RESCTAHAASQAATQRKPGTKI.LLPRAASVRGRGI PGAAEKPKK 
EIPASPSRTKIPAEKESHRDVLPDKPAPGAVNVPAAGSHLGQGK 
RAIPVP\KKI*GI>KKTIjLKAPGSYSN\LQRKSSSGA\VWSGASSA 
CTPQP VAKAKS SE FAS I PAN* LPGLCPNI S KS \GRMG PAMLRPA 
I*\PAGPVG\ASSWQAKRVDVSEIiAAEQLTAPP\SAS ptqpqtpe 
GGG \QWIiNS S CAWS ES S Q JJIKTRS I RRRDS CLNS KTK.VMPTPTM 
QFKIPKFS IGDS \ P DS S TPKL SRAQRPQS CTS VGRVT VHS TP VR 
RSSGPAPQSIiLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 
PL\CVPARRRS SEPRKNSAMRTE PTRESNRKTDSR \ LVDVS PDR 
GSPPSRVPQALNFSPEESDSTFSKSTATEVAREEAKPGGDAAPS 
E ALIiVDIKJUS PLAVTPDAASQPL I DLPLI DFCDTPEAHVAVGSE 
SRPLIDU4TOTPDMNKNVAKPSPWGQLIDLSS PLIQX»SPEADK 
ENVDSPLIxKF 


5824 


42 


2293 


LLTAIiSMEGGGGRDBPSACRAGDVNMDDPKKED ILLIiADEKtUJ? 
DLSLSSSSANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP 
TS BS P FAWS P LAGEECFVKVYKEAHLLALH I ES S S RNQAAQ AAKP 
EDP RSQG VERFIQES KF \K1 NLFEKE KEM KKS PTS LKRETY Y1*S 
DSPIJLGPPVGEPRLLASS PALPS SGAQARLTRAPGPPHSAHALP 
RESCTAHAASQAATQRKPGTKLLLPRAASVRGRGI PGAAEKPKK 
E I PAS P S RTKI PAE KE SHRDVLP DKF APGAVNVF AAG S HLGQGK 
KAI PVP \NKLGLKKTLLKAPGS y SN\LQRKSSSGA\VWSGASSA 
CTPQPVAKAKSSEFAS I PAN* LPGLCPNI SKS \GRMGPAMLRPA 
L\ PAGF VG\ASSWQAKR VDVS ELAAEQIiTAP P \SAS PTQPQTP E 
GGGXQWIJtfSSCAWSESSQLNKTRSIimro 

QFKIPKFSIGDS\PDSSTPKLSRAQRPQSCTSVGRVTVHSTPVR 
RSSGPAPQSLI^AWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 
PL\CVPARRRSSEPRKNSAMRTEPTRESNRKTDSR\LVDVSPDR 
GSPPSRVPQALNFS PEESDSTFSKSTATEVAREEAKPGGDAAPS 
EALLVDI KLEPLAVTPDAASQPLI DLPLIDFCDTPEAHVAVGSE 
SRPLIDLMTNTPDMNKNVAKPS P WGQLIDLS S PLI QLS PEADK 
ENVDSPLLKF 


5825 



2 


4210 



FLQI ESAS PAPFSSGFLAAH PHS PGGSLATKGRSRLSAPGMLHL 
SAAPPAPPPEVTATARPCLCSVGRRGDGGKMAAAGALERSFVEI. 
SGAERERPRHFREFTVCS IGTANAVAGAVKYSESAGGFYYVESG 
KLFS VTRN RF I HWKTS GDTLELMEES LD I NLLNNAI RLKFQN CS 
VTjPGGVYVSETQNRVI ILMLTNQTVHRLLLPHPSRMYRSELVVD 

sqmqs i ftd i gkvdftdpcnyql i pavpg i s pnstastawlssd 
gealfalpcasggif^klppydipgmvsVvelkqssvmqrllt 
gwmptairgdqspsdrpi^ij^vhcvehdafifai^qdhklrmws 

YKE QMCLMVAI5MLEYVPVKKDLRLTAGTGHKLRLAYS ptmglyl 
GIF\MHAPKRGQFC1FQLV^TESNRYSIJDHISSIiFTSQETLIDF 
ALTSTDIWALWHDAF27QTVVKYINFEHNVAGQWNPVFMQPLPEE 
EIVIRDDQDPREMYLQSLFTPGQFTNEALCKALQIFCRGTERNL 
Dl^VJSEIJaCEVTIiAVENELG^SVTEYEFSQEEFRJSnjOXJEFWCaCF 
YACCLQYQEALS HPLALHLNPHTNMVCXjLKKG YIjS FLI PS S LVD 
HLYLLPYENLIiTEDETTI SDDVDI ARDVXCLIKCLRLI EES VTV 
DMS V I MEMS CYNLQSPBKAAEQ I LEDMI T I DVENVMED I CS KLQ 
EIPJ/PIHAIGLLIREMDYETEVEMEKGFNPAQPLNIRMNLTQLY 
GSNTAGYI VCRGVHKI ASTRFLICRDLLiI LQQLLMRLGDAV I WG 
TGQLFQAQQDLLHRTAPLLLSYYLIKWGSECLATDVPLDTLESN 
LQHLS VLELTDSGAIiMANRFVSSPQTIVELFFQEVARKHI I SHL 
FSQPKAPI^QTGIilTWPEMITAITSYLLQLLWPSNPGCLFLECLM 
GNCQ YVQLQDY IQLLHPW CQVNVGSCRFMLGRCYLVTGEGQKAL 
ECFCQAAS EVGKEE FLDRL I RSEDGE IVSTPRLQYYDKVLRLLD 
VIGLPELVIQLATSAI TEASDDW\KSQATL\RTCIFKHHL\DLG 
\HNSQAYGSL * PQI PDSSRQLDCLRQLWVLCERSQLQDLVEFS 
YVNLHNEWGI IESRARAVDIJ«3JNYYBLLYAFHI YRHNYRKAG 
TVMFEYGMRLGREVRTIJ^GLEKQGNCYLAALNCL"RLIRPEYAWI 
VQPVSGAVYDRPGASPKRNHDGECTAAPTNRQIEILELEDLEKE 

LCQTFKLPLTP VFEGLAFKC I KLQFGGEAAQABAWAWLAANQLS 
SVITTKESSATDEAWRIiLSTYLBRYKVQNNLYHHCVINKLLSHG 
VPLPNWL INS YKKVDAAELLRL YLNYDLLDLTP YQVIRICGC 


5826 


3 


871 


ksqllrdhsapp p kp ctsvgamgc* prq/ spkeqqrqlkkqknr 
aaaqrsrqkhtdkadalho^k^slekdnlalrkeiqslcaelaw 
wsrtlhvherlcpmdcascsapgiiiigcwdqaegllgpgpqgqhg 

awae p pvqls pspllfashtgsslqgs s s kls alqps ltaqta 
ppqplelfjiptogiclgsspdnpssaix;larlqsrehkpalsaat 
wqglwdps phpllafpllssaqvhf 


5827 


194 


2287 


GMGSENSALI<SYTIJtEPPFTLPSGI^VYPAVLQDGKFASVFVYK 
RENEDKVNKAAJCVP* * HLKTLRHPCLLRFLSCTVEADGIHLVTE 
RVQPLEVALETIjS S AE VCAG I YD I LLAL I FLHDRGHLTHNN VCL 
SSVFVSEDGHWKLGGMETVCKVSQATPEFLRSIQS IRDPASI PP 
EEMS PE FTTLPECHGHARDAFS FGTLVES LLTILNEQVS ADVLS 
S FQQTLHS TLLNP I PKWRPALCTIJ^HDFFllNDFLEVVNFLKS L 
TLKS EEE KTEFFXFLLDRVSCLS EELIAS RLVPLLLNQLVFAEP 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
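
The notation in this column heading can be read mechanically. The following is a minimal sketch, not part of the original disclosure, of how one entry might be tallied under that notation; the function name `summarize_segment` and the sample string in the usage note are hypothetical, and whitespace inside an entry is assumed to be layout only.

```python
# One-letter codes exactly as listed in the column heading.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Annotation symbols from the heading: stop codon, possible nucleotide
# deletion, possible nucleotide insertion.
MARKERS = {"*": "stop", "/": "deletion", "\\": "insertion"}


def summarize_segment(segment):
    """Count residues and annotation markers in one table entry (hypothetical helper)."""
    counts = {"residues": 0, "stop": 0, "deletion": 0,
              "insertion": 0, "unrecognized": 0}
    for ch in segment:
        if ch.isspace():
            continue  # assumption: spaces and line breaks carry no meaning
        if ch.upper() in AMINO_ACIDS:
            counts["residues"] += 1
        elif ch in MARKERS:
            counts[MARKERS[ch]] += 1
        else:
            counts["unrecognized"] += 1  # e.g. characters damaged in scanning
    return counts
```

For an illustrative string such as `summarize_segment("SPY* KEANY/TG\\AC")`, the helper counts twelve residues, one stop codon, one possible deletion and one possible insertion. Characters outside the listed codes are reported as unrecognized rather than guessed at.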








VAVVKSFLPYLLGPKKDHAQGETPCIjLSPALFQSRVIPVUJQLF 
EVHEEHVRMVIiLSHI KAYVGALSLRF^LKXV\IL\PQVXJjG\LR 
D\TSDSIVAITLHSIAVLVSLLGPEVWGGE3.TKI FKRTAP\SF 
TK\NTDLS LEGDPFSQP I KFP XUGLSDVKNTSEDS ENFPSSSKK 
SBEWPDWSGPE\EPENQTVNI \QIWP\REP\CDDVKSQCTTTJDV 
EESSWDDCEPSSUDTKVNPGGG ITATKPVTSGEQKP I PAUJSIiT 
BESMPWKS SLPQKI SI>VQRGDDADQIEPPKVSSQERPLKVPS EI* 
GLGEEFT I QVKKKP VKDPEMDWFADMI PEIKPSAAFLILPELRT 
EMVPKKDDVSPVMQFSSKFAAAE ITEGEAEGWEEEGELNWEDNN 
W 


5828


2 


257 


AR EGG S L.GAVAACG E L£ YSCDFCPARPHTS W^TRFVKMEFQAVV 
MAVGGGSRMTDLTSS I P KPLL PVGNKF LIWY PULLER VGFE EV 
I VVTTRDVQKALCAEFKMKMKPD I VCIPDDADMGTADSIjRYIYP 
KLKTDV3jVI*S CDLI TDVALiHE WDLFRAYDAS LAMLMRKGQDS I 
EPVPGQKGKKKAVEQRDFIGVDS TGKRLLFMANEADLDEEL.VI K 
GS ILQKHPRI RFHTGIiVDAHLYCLKKY I VDFLMENG\S ITS IRS 
EL\ I PYLV/RGKQFSSASSQQGTRKEKEGGSKGKRGI.KS FRISY 
SPY* KEANYTGTGAP Y \D \ACW I 


5829 


260 


1259 



PDGRLI VS CSEDKT I KI WDTTNKQCVNNFSDSVGFANFVDFNPS 
GTCIASAGSDQTVKVWDVRVNKLL.QHYQVHSGGVNCIS FHPSGN 
YI» I TASSDGTLKI LDIjIjKGRLI YTI^2GHTGPVFTVS FS KGGELF 
ASGGADTQVLLWRTNFDEIjHCKG LTKRNL KRIjHFDS pphlldi y 
PRTPHPHEEKVETVEDFFLHLLRIil QSLR* S I CRSLLPLLWISF 

llilpo^kpwglcqtrvkrpvdis*tlp*chqnvcqqprkrk 
qkt* vtspvkvk/vsi plavtdalehimeqlnvltqtvs i leqr 
ltlted klkd clenqq klfs avqqks 


5830


4496 


3139 



GGKMAAPEERDLTQEQTEKLIiQ FQDLTG I ES MDQCZRHTIiEQHNW 
£i IEAAVQDRLNEQEGV PS VFNP P PS RPLQVNTADHR I YS YWSR 
PQPRGLLGWGYYL.IMIiPFRFTYYTILDIFRFALRFIRPDPRSRV 
TDPVGD IVS FMHS FEE KYGRAHPVFYQGTYSQALNDAKREIiRFI* 
LVYLHGDDHQDSDEFCRNTLXZAPEVISLINTRMIiFWACSTNKPE 
GYR VSQALRENT YPFLAMI MLKDRRE * PV \ VGRLEGkl \QPDDI* 
INQLTFI MDANQTYLVS E RLEREERNQTQVLRQQQDEAYlASIiR 
ADQEKERKKRBERERKRRXKEEVQQQKLAEERRRQNLQEEKERK 
LECI.PPEPSPDDPESVKIIFKLPNDSRVERRFHFSQSLTVIHDF 
I*FSLKESP\EKFQIEA\NFPRR\VLPCIPSEE\WPNPPTliQE\A 
GLSHTEVLFVQDLTDE 


5831 


71 


2897 


FCSKDKCCXiYLPDS INRSKSCTAKPGAHSQDRHAVMDSERQVKD 
TDDIESPKRSIRDSGYIDCWDSERSDSLSPPRHGRDDSFDSLDS 
FGSRSRQTPSPDWLRGSSDGRGSDSESDLPHRKLP0VKKDDMS 
ARRTSHGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKKAEREEYR 
KSWSTATS PAGIjGKKAIjQD YGPRT \ P VS \DDAESTSMFDMRC3E 
EAAVQPHSRARQEQLQLINNQLREEDDKWQDDLARWKSRKRSVS 
QDI*I KKEEERKKME KLLAGEDGTSERRKS I KTYRE I VQEKERRE 
RELKEAYKNARSQE EAEG I LQQYI ERFT I SEAVUBRIJSMP K ILE 
RSHSTEPNI>SSFI*NDPNPMKYLRQQSLPPPKFTATVETTIARAS 
VLDTSMSAGSGSPS KTVTPKAVPMLTPKPYSQPKNSQDVLKTFK 
VDGKVSVNGET VHREEE KERECPTVAP AHSLTKS QMFEG VARVH 
GSPLELKQDNGS I E INI KKPNSVPQELAATTEKTEPNSQEDKND 
GGKSRKGNIEIiASSEPQHFTTTVTRCSPTVAFVEFPSSPQIiKND 
VSEEKDQKKPENEMSGKVEIjVI^QKVvXPKSPEPEATLiI. FPr liD 
KMPFJHtf QLHLPNLNSQ VDS PSS EKS P VTTPFKF WAWDPEEERRR 
QEKWQQEQBRliLQERYQ\KEQDK\I*KEE\WEKAQKEVEEEERRY 
YEEEP* I 1 \EDPW?FTVSSSSADQLSTSSSMTEGSGTMNKIDL 
GNOQDEKQDRRWKKS FQGDDSDLLI>KTRESDRLEEKGSI*TEGAL 
AHSGNP VS KGVHE DHQLDTEAG APHCGTN PQLAQDPS QNQ QTSN 
PTHSSBDVKPKTIjPLDKSXNHQI ES PSERRKS ISGKKLCSS CGI* 
PLGKGAAMI IETLNI»YFH I QCFRCG \ I CKGQIjGDAVSGTDVRIR 
NGKLNCNDCYMRSRSAGQPTTIi 


5832 


2454 


829 


PGRRFRHGSCAFQKQCIMLHICQYFLOGECKFGTSCKRSHDFSN 
SENLEKI»EKI/a^SDIiVSRLPTIYRNAHDIKNKSSAPSRVPPI»F 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








VPOGT^RP irnc qRCV^PlMTT^nPFGnOIPLYHT R K^r"^ POOR" 

RVHFHIiPY RWQb'LiDRG KWEDLDNMEI*! EEAYCNPKIERI LCSES 
ASTFHSHCI*NFNAMTYGATQARRI^TASSVTKPPHFILTTDMI^ 
YWSDE FGS WQE YGRQGTVHFVTTVS S SDVEKAYIAY/WYTGV* R 
PGSHLEVPGRKAQLRVRFQSLRSEKPGLKHN* KGI*PQTQIR\AP 
QDVTTMQTCNTKF PG P KS I PDYWDSSAIiPDPGFQKITLSSSSEE 
YQKVWNIiFNRTLP FY FVQKIERVQNIiALWEVYQWQKGQMQKQNG 
GKAVDERQLFHGTSAI FVDAI CQQNFDWRVCGVHGTS YG KGS Y F 
ARDAA Y SHHY S KS DTQTHTM FLARVLVGE FVRGNAS FVRP P AKE 
GWSNAFYDSCVNSVSDPSI FVI FE KHQVYPE YVIQYTTS S KPSV 
TPS I LIALGS LFSSRQ 


5833 


170 


3289 


S ILCLLSPC WQFGKP WS I LSSRSRHSPCTKKGWEGMRKHbHT 
RQGHK* VHVEI S KALWVYRDDYFlRHS ISVSAVIVRAWITHKYR 
GRDWNVKWEEKLLHAVAKNYTLLQTI PPFERPFKDHQVCLEWNM 
GYI WNLRANR I PQCPLEISTOVVALI^FPYASSGEOTGIVKKFPRF 
RNFKT,RATRRQRMDYPVFTVSLWLYxjI^CKAWLCGII»YFVDSN 
EMYGTPSVFLTEEGYLH IQMHLVKGEDIAVKTKFI IPUCEWFRX 
D I S FNGGQ 1 WTTS I GQDL KS YHNQT I S FRED FH YNDTAG YF 1 1 
GGSR YVAG I EGFFGPI»KYYRIiRSIiHPAQI FNPLLEKQLAEQ I KX» 
YYERCAEVQE I VS VYASAAKHGGERQEAOnjHWS YIiDLQRRYGR 
PSMCRAFPWEKELKDKH PS LFQAUb EMDLLTVP RNQNES VS E IG 
GKI FEKAVKR1»S S IDGLHQISSIVPFLTDSSCCGYHKASYYLAV 
FYETGLlWPRlXiLQGMLYSLVGGOGSERljSS^I^YKHYQGIDN 
YPI^DWELS YAY YSN1 ATKT PLDQHTLQGDQAYVETIRT »KDDE 1 1* 
KVQTKEEXjD V FMWIjKHEATRGNAA^ p 
EAAIEWYAKGALETEDPAL IYDYArVTiFKGQGVKKNRRLALELM 
KKAAS KGLHQAVNGLiG WYYHKFKKNYA\ KAAKYWL KA\ EE \ MGN 
PDASYNLG^HLDGIFPGVPGRNQlTjAGEYFHKAAGGG^ 
WCSLYYITGNLETFPRDPEKAVVWAKKVAEKNGYLGHVIRKGI^N 
AYI*EGSWHEAI»LYYA/IiAAETGIBVSQTNIiAHI CEERPDIiARRYIi 
GVN CVWRYYNFS VPQ IDAPS FAYL KMGDLYYYGHQNQSQD1*EZjS 
VQMYAQAAIiDGDSQGFFNLAIiLIEEGTI IPHHILDFLEIDSTIiH 
SNNIS I liQEIiYERCWSHSNEES FS PCSIjAWLYLHI/RLLWGAILH 
SALI YFIiGTFLLS I LIAWTVQ YFQSVSASDPPPRPSQASPDTAT 
STASPAVTPAADASDQDQPTVTNNPEPRG 


5834 


17 


4020 


RFRRGGGRVFPGAFPAS PSDSIiGQGNSQGPPRTP KPPRT/QECG 
SAAPGP I PGQSSS * VPLRlbEQIQQKADCPLSLEIiALKPRMAAQV 
!TjBI^Sim>LIiEElUPI*PlXX2PCIEPPPSSl^YQPNFNTNFEDR 
NAFVTG I ARY I EQATVHS SMNEMLEEGQEYAVMLYTWRSCSRAI 
PQVKCNEQ PNR VE I Y^KTVE\niiEPEVTkljMNTEWYFQ ERFC 
GEVRRLCHAERRKDFVS EAYLITIXJKTINMFAVLDRLKNMKCSV 
KNDHSAYKRAAQFLRKMADPQS IQESQNLSMFLANHNK1TQSLQ 
O^LBVISGYEEIJ«ADrv^CVDYYENRMYI,TFSEKHMLLKVMGF 
GLYLMDGS VSN I YKLDAKKR INLS KI D KYFKQLQVVPLFGDMQ I 
ELARY I KTSAH YEENKS RWTCTSSGS S PQYNI CEQMI QIREDHM 
RFISELARYSNSEAAmSSGRQEAOKTDAEYRKIiFDLALQGIXJLL 
SQWSAHVMEVYSWKLVHPTDKYSNKDCPDSAEEYERATRYNYTS 
EEKFALVEVTAMI KGIX3VLMGRMESVFNHAIRHTVYAALQDFSQ 
VTLMEPLRQAI KKKKNVIQS VLQAIRKTVCDWETGHEPFNDPA1* 
RGEKDPKSG*D 2 KVPRRAVGPSSTQLYIWRTMLESLIADKSGSK 
KTLRSSLEGPTILD I EKFHR\ESFFYTII1jINFSETIX?QCCDI>SQL. 
WFREFFLELTMGRRIQFPIEMSMPWILTDH^l^KEASMMEYVL 
YSLDLYNDSAHYALTRFNKQFLYDE IEAEVNLCFDQFVYKIiADQ 
I FAY YKVMAGSI*LIJ5KRl^SECKNOGATIHLPPSiroYBTLLKQR 
HVQLLGRS IDLNRL I TQRVSAAMYKSIiELAIGR FES EDLTS I VE 
LDG1X.E I N RMTHKLIxS R YLTJLDGFDAM FREANHNVSAP YGR I TL 
HVFWELNYDFLPNYCYNGS TNRF VRTVJjP FSQEFQRDKQPNAQP 
QYLHGSKALNIiAYSS IYGSYRNFVGPPHFQVICRIjLGYQGIAVV 
MEELLKVVTCSIJ^C^TII^YVKTIjMEVMPKIC^ 
EFFHHQLKDIVEYMLKTVCFQNLREVGNAILFCLI#IEQSI*SLE 
EVCDLIiHAAPFQNI IjPRVHVXEGERLDAKMKRLES KYAPLHLVP 
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Predicted end 
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location 
corresponding
to first
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residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LI ERLGTPQQ I AIAREG DI*I#TKERI*CCGL*SMFEVI LTRI RS FLD 
DPI WRGPLPSNGVMHVDE CVEFHRLWSAMQ FVYC I PVGTHE FTV 
EQCFGIX3LHmGCmiVLIX;CX2RRFAVIJ)FC^^ 
EIIKNVPUOCMVERIRKF<3ILNDEIITII^KYI>KSGIX5TCTPVE 
HVRCFQPPIHQSLASS 


5835


4209 


1904 


SGNI RI^QGSHQIDFQVLHDLRQKFPEVPEVVVSRCKIXJKNNNIj 
DACXAVLSQESTRYLYGEGDLNFSDDSGISGIiRNHWTSLKLDLQ 
SQNIYHHGREGSRMNGSRTI*THSISDGQLQGGQSNSELFQQEPQ 
TAPAQVPQGFNVFGMSS S SGASNSAPHIX3 FHLGS KGTSSLSQQT 
PRFNPIt-TVTIAPNIG^ROTPTSIiHIHG^PPPVLNSPQGKSIYI 
RPYITTPGGTTRQTQQHSGWVSQFNP>INPQQVYQPSQPGPWTTC 
PASNPLSHTSSQQPNQQGHQTSHVYMPI SSPTTSQP PTIHS SGS 
SQSSAHSQYNIQNI3TGPRKNQIEIKLEPPQRNNSSKLRSSGPR 
TSSTSSSVNSQTLNRNQPTVY IAASPPNTDE1MSRSQPKVY I SA 
NAATGDEQ VMRNQPTLF I S TNSGASAASRNMSGQ VS MGPAF I HH 
HPPKSRAIGNNSAT3PRVWTQPNT\EYTFKITVS PNKPPAVS P 
GWS PTFELT^n^IJJHPDHYVETENIHHl»TDPTIiAHVDRISETRK 
LSMGSDDAAYTQDI *RJSNS WJW3MVAHACNSSAIX3GQDGRI I *A 
QEFETS WGNIWRLRIjYRRF*NYAGMVAHTCSPS YSVD * ALIjVHQ 
KARMERIiQRELB IQKKKLDKIjKSEVNEMENNIjTRRRL»KRSNS I S 
QI PSLEEMQQLRSCNRQLQIDIEXXTKE IDLFQARGPHFNPSAI 
HNFYDNI G FVGP VPP KPKDQRS 1 1 KTP KTQDTEDDEGAQWNCTA 
CTFLNHPALIRCEQCEMPRHF 


5836 


361 


2303 


FHITMCGICCSVNFSAEHFSQDLKEDL1.YNLKQRGPNSSKQUjK 
SDVNYQCLFSAHVLKOtRGVLTTQPVEDERGNVFLWNGEIFSGIK 
VSAEENDTQIIiFNYLSSCKNESEILSLFSEVQGPWSFIYYQASS 
HYIiWFGRDFFGRRSIiLWHFSNLGKSFCIiSSVGTQTSGLANQWQE 
VPAS\DFSELILSIJJSFPDAIjFTOCII^NIFlJGRILIiKKMLIA* 
VXFQQTYQHLYQR*QMKPNCILKNLIiFL* I*CCHKLHWRLIAVI 
FPMCKLQERYFKS FLLMYT * KEVIQQFI DVLSVAVKKRVLCLPR 
DENLTANEVLKTCDRKANVAILFSGGI DSMVIATLADRHIPLDE 
P IDJULNVAFIAE EKTMPTTFNREGNKQ KNKCEI PSEE FS KDVAA 
AAADS PNKH VSVPDR I TGRAGLKELQAVS PSRI WNFVE INVSME 
EliQKLRRTRI CHLI RPLDTVLDDSIGCAVW FASRG I GWLVAQEG 
VKS YQSNAXWI/TG IGADEQLAGYSRHRVRFQSHGLEGLNKE IM 
MEI/3RJSSRNIiGRDDRVIGDHGKEARFPFLDENVVSFl.NSLPIV7 
EKANLT1jPRGIGEKLI*LWjAAVEIjGXTA3 AIJiPKRAMQFGS R I A 
KMEKINEKASDKCGRLQIMSI#ENI»SIBKETKI> 


5837 


4792 


903 


NGN AVAQAPVTNCCYIATGS KDQTIR I WS CSRGRGVM I LKLP FL 
KRRGGG I DPTVKjERLWLTLHWPSNQPTQLVS S CFGGELLQWDLT 
QSWRRKYTLFSASSEGQNHSRIVPIDJCPLOTEDDKQIjLLSTSMD 
RDVKCWDIATLECS WTLPSLGGFAYSLAFS SVDIGSLAIGVGDG 
MIRVWNTLS IKNNYDVKNPWG^SVKSKVTAliCWHPTKEGCIAFGT 
DIX3KVGLYDTYSNKPPQISSTYHKK3VYTIA^ 
GDRPSLALYS CGGEG I VLQHN PWKLSG EAFD I NKLI RDTNSI KY 
KLPVHTEI SWKADGKIMALGNEDGSIE I FQ\ IPNLKLICTIQQH 
HiaVOTISWHHE\HGSPAQKLSYL\MPSGSQQCSPFTCHNLKNC 
P* KAAPES PSDPLQSPYRTPPQGHTAQDYPVWAWEPHIH* WEGI* 
VFCFPIDGYSPGCWD\AFPGKEAPVAI FRG\HQGRLLCVAWSPL 
DPDCI YSG\ADDFCVHKWIjTSMQDHSRP PQGKKSIELEKKRUSQ 
PKAKPKKKKKPTIiRTPVKUBS I DGNEEES M KENSG PVENG VS DQ 
EGEEQAREPELPCGLAPAVSREPVICTPVSSGFEKSKVTINNKV 
IIJJCKEPPKEKPETLIKKRKARSLLPLSTSLDHRSKEELHQDCL 
VLATAKHS R E LNE D VS ADVE ERFHLGLFTDRATL YRM I D I EG KG 
HI^GPIPELFHQIiMLWKGDLKGVX.QTAAFJlGEIjTD^VAMAPAA 
GYHVWLWAVEAFAKQliCFQDQYVKAASHIiLS IHKVYEAVELJjKS 
NHFYREAI AIAKARLRP 3DP VLKDLYI*S WGT\njERDGHYAVAAK 
CYliGATCAYDAAKVLAKKGDAAS LRTAAELAAI VGEDELSASLA 
I^CMEXjLLANNWGAQEALQLHESDW 

EKQLSEGKSSSSYHTWNTGTEGPFVERVTAVWKS I FSLDTPEQY 
QEAFQKLQN I KYPSATNNTP AKQIjIiLH I CHDLTLAVLSQQMASW 
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SEQ 
ID 

NO: 


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








DEAVQAIJLjRAVVTISYuSGSFTIMQEVYSAF^ 

HQS PATPAFKSLEAFFLYGRLYEF1WSI>SRPCPNSSVWVRAGHR 

TLSVEPSQQIjDTASTEETDPBTSQPEPNRPSBLDLRLTEEGERM 

LSTFKELFS E KHASLQNS QRTVAEVQETLAEM I RQHQKS QLCKS 

TANGPDKNEPEVEAEQPLCSSQSQCKEEKNEPLSLPELTKRIiTE 

ANQRMAKFPESIKAWPFPDVLECCIjVLIjI»IRSHFPGCI»A 

QAQEIjLQKYGNTKTYRRHCQTFCM 


5838


110 I 


98 


KTMPKLLVTFRDVAIDFSQEEKECIaDPAQRDLYRDVMLENYSNIj 
I SLDLESSCVTKKLSPEKEI YEMES \PSGRIWGNVSTI TFQYNG 
LGDNMECKGN1.EGQVSKSEGLYMCVKI TCEE KATESHSTS STFH 
RII/HYQGKIVKCKECRQGFSYIiSCLIQHEEHHNI*KCSEVNKH 

IHTNEKPYQCNACGKAFIRGSQLTEHQRVHTGEKPYDCKKCGKA 
FSYCSQYTLHQRIHSGEKPYECKDCGKAFILGSQLTYHQRIHSG 
EKPYECKECGKAFILGSHLTYHQRVHTGEKPYICKECGKAFLCA 
SQLNEHQRI HTGE KP YECKECGKTF FRGSQLTYHLRVHS GERPY 
KCKECGKAF I SNSNL I QHQRIHTGEKP YKCKECGKAF I CGKQLS 
EHQRIHTGEKPFECKECGKAFIRVAYIjTQHEKIHGEKHYECKEC 
GKTFVRATQIjTYHQRIHTGEKPYKCKECDKAF/HLWLTII^SEHQ 
RIHRGEKPYECKQCGR/LFIRGSHL/NEHLRTHTGEKPYECKEC 
GRAFS RGSEHT1*HQR I HTGEKP YTCVQCGKDFRCPSQLTQHTRL 
HN*EYSSHKICMHSIALASLDFAHLQEICNPEN 


5839 


1 


2425 


GRPFPRPPRALPRL^LRGRRQDGRWTVDFEECLKD \sprfraal 
EEVEGDVAEIaELKX^XDKLVKLCIAXMIDTGKAFC^AIS^QF^G^ 
RD\lAQNS\NNDA\VVETKFAPSFIiDSLQEMINFHTIL/l»* PNS 
EIN*GHSFX2NFVKEDLRKFKDAKKQFENSQ*KRKKIALVKNAPV 
PSRPASLEL * KP PNILTATRKCFRHIALDYVLQINVTjOSKRRSE 
I LKSMLS FMYAHLAFFHQG YDL FS ELG PYMKDLGAQLDRLVGDA 
AKEKREMEGKHSTIQQKDFSRDDSICLKYNVDAANGIVMEGYLFK 

RASHAEKTWNRRWFS IQNHQVVYQKKF KDNPTWVEDLRLCTVK 
iTf-rT?T\ t ctj r> nv»cctrt re tit , V C rMT jTiT\. Ti <IV IfT ."RO A.W T KAVOTS T VAT 
AYREKDDESEKLDKKSSPSTGSLDSGNESKEKLLKGESALQRVQ 

CI PGNAS CCDCGLADPRWAS INLGITLCIECSG IHRSLGVHFSK 
VRSLTLDTWEPELLKI>MCELGISnDVINRVYEANVEKM 
QRQEKEAYIRAKYVERKFVDKIFI** SLS PP\BQQKK\FVS KSSE 
EKRLS ISKFGP\GDQVRASAQSSVRSNDSGIQQSSDDGRBSLPS 
TVSANSLYEPEGERQDSSMFLDSKHLNPGIJQLYRASVEKN1.PKM 
AEAIAHG ADVNVJANS EENKAT PL I QAVLG GSLVT CEFLLQNG AN 
VNQRDVQGRGPLHHATVLGHTGQVCLFLKRGANQHATDEEGKDP 
LS IAVEAANADI VTLLRLARMNEEMRE SEGLYGQPGDETYQD I F 
RDFSQMASNNPEKLNRFQQDSQKF 


5840 


698 


3610


KHLKLPRQHLTTLWQIS S PRWRS P. QRAFMSALS KTQTQSAPALQ 
GI^SLLQSVTGNPVPASEAASQSTSASPAirrTVYTIKGRKLPSS 
AQPFI PKSFNYSPKrSSTSEVSSTSASKAS IGQSPGLPSTAFKLP 
SNTKGFTATHNTS PAAPPTE VTI CQSSEVSKPKL\ ESESTS PS L 
\EMKI HNFLKGNPG FS VA* NLKHPNPAGSLGSS APS ESHP SDFQ 
RGPTSTS IDNIDGTPVRDERSGTPTQDEMMDKPTSSSVDTMSIjL 
SKI IS PGS STPS STRSPPPGRDESYPRELSNSVSTYRPFGIiGSE 
SPYKQPSDGMERPSSLKDSSQEKFYPDTSFQEDEDYRDFEYSGP 
PPSAMMNLQKKPAKSILKSSKLSDTTEYQPILSSYSHRAQEFGV 
KSAFPPSVRALLDSSENCDRLSSSPGLFGAFSVRGNEPGSDRSP 
SPSKNDSFFTPDSNHNSLSQSTTGHLSLPQKQYPDSPHPVPHRS 

LFS PQNTLAAPTGHPPTSG VE KVLASTIS TTSTIEFKNMLKNAS 
RKPSI5DKHFGQAPSKGTPS3DGVSLSNLTQPSLTATDQQQQBEHY 
RI ETRVS S SCLDLPDS TEBKGAP IETLG YHSASNRRMSGEP I QT 
VES IRVPGKGNRGHGREASRVGWFDLSTSGSSFDHGPSS ASELA 
SLGGGGSGGLiTGFKTAPYKERAPQFQESVGS FRSNS FNS TFEHH 
LP PS PLEHGTPFQRE P VGPSSAPPVPPKDHGGI FS RDAPTHL PS 
VDLSNPFTKEAAlJu^AAPPPPPGEl^IPFPTPPPPPPPGEHSS 
SGGSGVPFSTPPPPPPPVDHSGVVPFPAPPLAEHGVAGAVAVFP 
KDHSSLLG^TLAEHFGVLPGPRDHGGPTQRDLNGPGLSRVRESL 
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SEQ 
ID 
NO: 


Predicted 

beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








TLPSHSIjEKLGPPHGGGGGGGSNSSSGPPLGPSHRDTISRSGII 
LRSPRPDFRPREPFLSRDPFHSLKRPRPPPARGPPFPAPKRPFF 
PPRY 


5841 


1908 


762 


GI*RLFLVLTVWPMMKPSWLSRTEPSKRI*LCRTLWCGSGWSSRSY 
TRSMLKMTTS INRRSRTSTKSTRTSARPGLTATVS IGLSDSPTW 
RHCWMTARS CSGEKGGHWAPRQVGVYLL.PGRVGCVSSRVSPSFP 
GIXSIiDSGliARRGSAVSALASGLVEEPMLGP PFHPTPRFKAVSAK 
SKEDLVSQGFTEFTIEDFHNTFMDLIEQVEKQTSVADLLASFND 
QSTSD YI»VVYIjRIjLTSG YliQRES KFFEHF I EGGRTVKE FCQ\QB 
\VEPMCKESDHXHI IALAQGLQRVHPGWEYMGPRPRAATTNPHI 
FP*GLPSPKVYIjIiYRPG\HYDILYKIGIiGSSPLGCPGCPI*tiARA 
LGHCYRGFSWVKMSYFTPFFI*SHDPPPMFY 


5842 


307 


1918 


QEPTADFKLRSTCGCGREKTCPDKPGQLINW FI CSLCVPRVRKL 
WS SRRPRTFJiNL.ljLGTACAI YI*G FLVSQVGRASLQHGQAAEKGP 
HRSRDTAEPSFPEIPLDGTLAPPESQGNGSTI^PNVVYITURSK 
RSKPANIRGTVXPKRR KKHAVASAAPGQEALVG PSIjQ PQEA\ EG 
KLMIi * HLGTLREQXWliRXiESDPGGWCG VRE / WRAGGPDFLQPS S 
RESNXRXYSESAPSWLSKDDXRRHRIiIADSAVAGEjRPVSSRSGA 
RLLVIiE GGAPGAVLR CG PS PCGLLKQPLDMSEVFAFHLDR I L»GI> 
NRTL.PS VSRKAE FIQDGRPCPI I LWDASLS S ASNDTHS S VKLTW 
GTY QQLIiKQ KC17QNGRVPKPESGCTE IHHHE WS KMALFDF1*LQI 
YNRLDTNCCGFRPRKEDACVQNGLRPKCDDQGSAALAHI IQRKH 
DPRHLVFIDNKGFFDRSEDNLNFKLLEGIKEFPASAVYVIJCSQH 
LRQKXLQSLFIJDKGYWESQGGRQGIEKLIDVIEHRAKILITYIN 
AHGVKVLPMNE 


5843 


500 


1453 


GTARLVTCWVLHGQ*VKKPAWEPGVVWI,*Q*RCRPKGWGLGAGM 
R3SRMS QPPQC1»RRAQS S CCHFMVKLLDDGTFM I PG EKVAHTSI* 
DALVTFHQQKPIEPRREIiLTQPCRQKDPANVDYEDLFTJYSNAVA 
EEAAC P VSAPEEAS PKPVL.CHQSKERKPS AEM / RQNNHQGSHFL, 
LiPP KI P S WRDP PETT .KE P QNAPRER P EG P AAAKKP P RHCB LWT 
LGCPE I HGDLRP WDRKRQPRSLRGSHLGGQRLHG S I*CGH I S QKP 
LIAPGTKRQKG PHQEGREVGQI*H*GD PRGQEIAPNGS ES P I liPG 
VQARAPGLGRA 


5844 


202 



2471 


FDSAVI^SINVMAVLPGPLQIiI*GVLLTISI,SSlRl»IQAGAYYGI 
KPLPPQIPPQMPPQI PQYQPLGQQVPHMPIiAKDGLAMGKEMPHL 
QYGKE YPHLPQ YMKE I QPAPRMGKEAVP KKGKE I PLASLRGEQG 
PRGEPGPRGPPGPPGLPGHGIPGIKGKPGPQGYPGVGKPGMPGM 
PGKPGAMGMPGAKGEIGOKGEIGPKGI P * POGPPGPHGLPG IGK 
PGGPGLPGQPGPKGDRGPKGLPGPQGLRGPKGDKGFGMPGAPGV 
KGPPGMHGP PGP VGL PGVGKPGVTGFPG P \ QGPLGK\ PGAPGEP 
GPQGPIGVPGVQGPPGIPGIGKPGQDG\IPGQPGFPGGKGEQGI» 
PGLPGPPGIjPGIGKPGFPGPKGDRGMGGVPGALGPRGEKGPIGA 
PGIGGP PGEPGLPGI PGPMGPPGAIGFPGPKGEGG I VGPQGPPG 
PKGEPGIiOGFPGKPGFIiGEVGPPGMRGFPGPIGPKGEHGQKGVP 
GLPGVPGLI^PKGEPGIPG1>0GLQC5PPGIPGIGGPSGPIGPPGI 
PGPKGEPGLPGPPGFPGIGKPGVAGLHGPPGKPGALGPQGQPG1, 
PGPPGPPGPPGPPAVMPPTPPPCGETYLPDMGIGXDGVKPPHAYG 
AKKGKNGG P A YEMP AFTAELTAP FP P VGAP VKFNKLiL YNGRQNY 
NPQTG I FT CEVPG VYY FAYHVHC KGGNVWVAL FKNNE PVM YTYD 
EYKKGF1*DQASGSAVLLLRPGDRVFI^PSEQAAGLYAGQYVHS 
SFSGYLLYPM 


5845 


215 


2061 


HASNKSASI^DKMANPKEKTAMCLVNEIJ^RFmVQPQYKLLNER 

GPAHSKMFSVQLSLGEQTWESEGSS IKKAQQAVGNKADTESTLP 

KPI*KPPKSNVNNNPGCITPTVEIiNGLAMKRG\KPAIHRPLDPK 

PFPNNRANYNFQVMYNQRYHCP I PKI FYVQLTVGNNE FFGEGKT 

RQAARHNAAMKALQALONEPIPERSPQNGESGKDMDDDKDANKS 

EISLVFE IALKRNMPVSFEVIKESGPPHMKS FVTRVSVGEFSAE 

GEGNS KKLS KKRAATT VLQELKKLP PLPWEKP K\H FFKKRPKT 

IVXAGPEYGGGMNPISRlAQICX5AKKEIG2PDYVIiI>SERGMPRRR 

EFVMQVKVGNEVATGTGPNKKXAKKNAAEAML^^ 

DQLEKTGEN KGWSGP KPGFPEPTNNTP KG I LHLS PDVYQEMEAS 






WO 01/53312 



PCT/US00/34263



SEQ 
ID 
NO: 


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








RHKVISGTTLGYIiSPKDMNOPSSS FF<5 I S PT^WS SATT ARKI.T M 
NGTSSTAEAIGLKGSSPTPPCSPVQPS KQLE YLARI QGFQVHYC 
DRQSGKECVTChThAPVQMTFBAZGSS I EASHDQV* YATAILLC 
YGPARKWKA1KKELAMCAHAA1J^1,IHYI>I^ 
N- 


5846 


1126 


456 


FS KLIKKTF I IG I SG VTNSGKTTLAKNIiQKHL PNCS VI SQDDFF 
KPESEIBTDKNGFIiQYDVLEAIiNMSKMMSAISCWM 
TDQ5SAEE!Pil*IIEGFliLFNYKPLDTIWNRSYFLTIPySECKR 
RRSTRVYQ P PDS PGYFDGHVW PM YLKYRQEMQD 1 TWEWYLDGT 
KSEEDLFLQVYEDLIQELAKQKCLQVTA* RRNTTNPS /CK+ IRK 
LOG VI 


5847 


2769 


505 


APEMEDLS S PDSTLLQGGHNLLSSAS FQES VTFKDVI VDFTQEE 
WKQLDPGQPJDLFRDVTLENYTHLVS IGLQVSKPDVISQLEQGTE 
PWIMEPSIPVGTCADWETRLENSVSAPEPDISEEELSPEVTVEK 
h^CRDDSWSSNIjDESWEYEGSI^RQQANQQTLPK^IKVTEKTIPS 
WEKGPVNNEFGKSVNVSSNLVTQEPSPEETSTKRSIKQNSNPVK 
KEKSCKCNECG KAFS YCSALIRHQRTHTGE KP YKCN* /CVEKAF 
SRSENIilNHQRIHTGDKPYKCDOCGKGFIEGPSIiTQHORIHTGE 
KPYKCDECGKAFSQRTHLVQHQRIHTGEKPYTCNECX3KAFSQRG ; 
HFMEHQKIHTGEKPFKCDECDKTPTRS THLTQHQKIHTGEKTYK 
C^ECGKAFNGPSTFIRHHMIHTGEKPYECNECGKAFSQHSNLTQ 
HQKTHTGEKPYDCAECGKSFSYWSSIiAQHLKIHTGEKPYKCNEC 
GKAFS YCSSI*TQHRRIHTREKPFE CSECGKAFS YTiSNLNQHQ KT 
HTQEKAYECKECGKAFIRSSSLAKHERIHTGEKPYQCHECGKTF 
* SYGSSLIQHRKIHTGERPYKCNECGRAFNQNIHLTQHKRIHTGA 
KPYECA3CGKAFRHCSSIAQHQKTHTEEKPYQCNKCEKTFSQSS 
HLTQHQRIHTGEKPYKCNECDKAFSRSTHLTQHQRIHTGBKPYK 
C3aECGK\TFSQSTYLIQHQRIHSGEKPFGCNDCGKSFRYRSAZiN 
KHQRLHPGI 


5848 



22 



2961 


AAPRRIjLiRGGDGDRTPRF PLPALLRPG P PAEAAPERRKMPAVS K 
GDGMRGLAVFI SD I RNCKS KEAEIKRI WKEIiAN I RS KFKGDKAL 
DGYS KKXYVCKLLFIFI>LGHDIDFGHMEAVNLI.SSNRYTEKQIG 
YliFISVIjVNSNSELIRIjINNAJKiroiJVSRi^TFT^GltAIjHCIASV 
GSREMAEAFAGE I P KVLVAGDTMDSVKQSAALCLLRLYRTS PDL 
VPNK3DWTSRVVHLLNDQHLGVVTAATSIJ ITTLAQKNPEEFKTSV 
SLAVSRLS \RIVTSASTDLQDYTY*FCPGFU3LSV}OjX,RIJ j QCY 
PPPDPAVRGRLTECIiETILNKAQEPPKS KKVQHSNAKNAVLFEA 
ISLI IHHDSEPNIiLVRACNQLGQFliQHIlF/rNrjRYIAIiESMCTLA 
SSEFSHEAVKTHIETVINAIiKTERDVSVRQRAVDLLYAMCDRSN 
A?Q IVAEMI>SYX«ETADYS IREE I VLKVA I LAEKYAVDYTW\ YVD 
T I LNL I R IAGD YVS EEVWYR VI Q rVINRDDVQGYAAKTVFEALQ 

CSVPTFJU^I^TYIXTVNLFPEVKPTIQW^ 
QQRAVEYLRLSTVASTOILATVLEEMPPFPERESSIIjAKXiKKKK 
GPSTVTDLEDTKRDRS VD VNGG PEPAPAS TSAVSTP S PSADLLG 
LGAAP PAPAGPP PS SGGSGLLVD VFSDS AS VVAPIiAPGSEDiNFA 
RFVCKNNGVIjFENQLIjQIGLKSE FRQNIjGRMF I FYGNKTSTQFL 
NFTPTLI CSDDLQPNLNT OTKPVDPTVEGGAQVQQWNI ECVSD 
FTEAPVLNIQFRYGGTFQNVSVQLP ITLNKFFQPTEMASQDFFQ 
RWKQLSNPQQEVQN I FKAKHPMDTEVTKAKI IG FGSALLE S VDP 
NPANFVGAG I IHTKTTOI GCLLRLE PNLQAQMYRIiTLRTS XEAV 
SQRLCELI*SAQF 


5849 


3545 


1895 


KRRE I KETVFHHVAQAGLELLSS SN P PSSAS RSAGI TGMRHQVQ 
P*DPCMSLSPPCFTEEDRFSLEAI^IHKQMDDDKXK3GIEVEES 
DEFIRElWKYKDATNKHSHIiHREDKHITIEDLWKRWKTSEVHl^ 
TLEim^WLIEFVELPQYEKNFRDNNVKGTTLPRIAVHEPSFMI 
SQLKI SDRS HRQKLQLKAIjDWLFGPLTRP PHNV7MKDF I LTVS I 
VIGVGGCW FAYTQNKTS KEHVAKMMKDLiES LQTAEQSI1MDI1QER 
IjS KAQEENRNVAVEKQNli*RKMMDEI NYAKEEACRLRE liREGAE 
CELSRRQYAEQELEQWMALKKAEKEFELRSSVJSVPDALQ KWLQ 
LTHEVEVQYYNI KRQNAEMQLAIAKDEAEKI KKKRSTVFGTLHV 
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SEQ 
ID 
NO: 


Predicted 

beginning 

nucleotide 

location 

corresponding

to first 

amino acid 

residue of 

amino acid 

sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








AHSSSbDEVDHKIIiEAIOCAI^ELTTCXRERLFRWQQi. EKI CGFQ 
IAHNSGLPSLTSSLYSDHSWVVMPRVS I PPYPIAGGVDDLDBDT 
PPXVSQFPGTMAKPPGSLARSSSLCRSRRSIVPSSPQPQRAQ1A 
PHAPHPSHPRHPHHPQHTPHSLPSPDPDILSVSSCPALYRNEEE 
EEAI YFSAEKQWEVPDTASECDSLNSS I GRKQSPP /S KPRD I PN 
1 1 S/DERYQEMRCP * RI PSGGIL 


5850 


3 


1895 


KAVliNFSASGS VISLTGSNPMHDASMWHLKKNGI IVYLDVPKLN 
Ij I CRIiKLM KTDRI VGQNSGTSMKDLL, K FRRQ YYKECW YDARVF CE 
SGAS PEEVADKVLNA I KRYQDVDSETFISTRHVWPEDCEQKVSA 
EFFIEAVI EGLiASDGGLFVPAKE FPKL>S CGEWKSLVGATYVERA 
Q I LLERC I H PAD I P AARLG EM I ETA YGEN FACS KXAP VRHLSGN 
QF ILELFHGPTGSFKDLSLQLMPHI FAQCI PPSCNYMI LVATSG 
DTGSAVLNGFSRLNKNDKQRIAWAFFPENGVSDFQKAQI IGSQ 
RENGWAVGVESDFDFCQTAIKRIFNDSDFTGFIjTVEYGTILSSA 
NS INWGRLLPQWYHASAYLDLVSQGFI SFGS PVDVCI PTGNFG 
KIIJUVVYAKMMGIPIRKFICASNQN1IWTDFIKTG\HYDLRGKB 
R* AQTFFTVQ* I FLPNLSNLERHLHLMANKDGQLMTEIjFNRLES 
QHHFQIEKALVEKIX)QDFVADWCSEGECI^INSTYNTSGYIIjD 

phtavakwadrvqdktcp vi i sstah ys kfap aimqalki ke i 
nbtsssqlylixssynalpplheaijlertkoxjekmeyqvcaadmn 

vlkshveqlvqnqfi 


5851 


3120 


1802 


RCYLQFLALLLTSTSARAAAAIAAAEEPAGSPSVMTRAGDHNRQ 
RGCCGS LAD Y1»T S AKFLL YLGHS LS TW G DRMWH FA VSVF L»VE1i y 
GNSLLLTAVYGIjVVAGSVLVIiGAI IGDWVDKNARI*KVAQTS LW 
QNVSVILCGI IIiMMVFLHKHELLTMYHGWVLTSCYILI ITIANI 
ANLAS TATAI T I QRDWI WVAGEDRSKLANMNATI RRI DQ I/TNI 
LAPMAVGQIMTFGSPVIGCGFISGWNLVSMOTEYVIiLWKVYQKT 
PAIAVKAGL KEEETEIiKQliNLHKDTEP KPLEGTHIiMGVKDS NIH 
EIjEHEQEPTCASQMAEPFRTFRDGWVSYYNQPVF/IjGWHGSCFP 
LYDCPGIj* LHHHRVRIiHSGTEWFHPQYFDGS ISYNWNNGNCS FY 
LATSKMWFGSDRSDLRIGTAFLFDIiVCDLCIHAWKPPGLVRFSF 


5852 


1 


422 


KTTFPSSLCPLRQLPEVRGYSGQPLTDPLISLCRSHKCRGKGWG 
SSS YPSLPALLRARSAPGHCTHRSCGPEWRIDS I SRLEMQGARR 
SGWAQAQPT I LIXVPRLRKSLPSI WG/SLMGF FITSGPG/ WFRQ 
YYFFI SGRH+ VLFTESDFYYVAMDFGGHGLSSHYS PGVPYY1.QT 
FVSB IRRVVAGKKQSVYFRRCGGCSRAP PIiITGGGVGSRKQRWP 
ESGAWAIiAPGLPAIHGRSWES 


5853 


223 


1346 


RJaLGLSRVKGLHGPAASAWISDPETRGDPGGPWGMWRGSDLRPR 
PVSIiTGljTLVCK*AAG^POV\HSVKLGFGIiGG\PCLI*\FPIFRP 
IiLLH PRRPRLH PGTRG VAVEPHALR VVHVAHGE EAG I RAAG PGH 
GGVE I PQG / VGSI^3ARRGLRPSRPSSRHRNRVPAPPPGRPIiATP 
HRRRFPPDPALTCPGIiGQDQGPREQQKQGSGRHDTIIiGDWGESE 
SRWVRGNFRTGTAATLIGFS RNPTLNGS ENWGSLVS I QEEGPDT 
GWEREKRNPAEMGOTQRV^PIHTPPIiGPEILRAMPEALRAMPE 
ALGLRPDPArSVPS AL»S /QTF / PESWPRSCLRNQGETLGMG PVP 
I^SLCITESPSQNWTPCLLkLTCPRGLF 


5854 


86 


938 


KGRNTAPEKKGAALNNRENASS+NGY/SRWKQDIRRIENHI IQE 
LXHLCAMI KRVLLERI^mOCLRELTEGRTLDWPQNRITEVSAK 
RQI VTE YREKGKRN* EEKKRDLEGRSRRYNLCI IG I PETEDRAS 
GAET I KDLLE /ENFPEL KNELDLQMEKAHR I PUCFNEKKAASRH 
I RVTFL / KFQRRN I LQASSQRKQVTYKGAKVRLTS DFS PAILNA 
RRQW/N/ PISRVIiRENNFEPRI IYSAKLSFLYKGNWKTFLDIQG 
LG KYINQELS LK ILLKDLDQLTENIiN 


5855 


536 


2391 


LRSYGCKAPSRISHLHK\FLFliLLPSLLMGYSESPPPITDSWAP 
FISLTHHVLSQSQS PLSSNCWI CLSTHTQ* FTALPADLIjTWTQS 
NVSLHISYLAIPFLADSFUCPV/I.* PGNSAKHLSFKLS SLSMVS 
GRAVAI^LIASGLTSIQTNTASSKPPIWGY\LSTOTSFISPPP 
LCLS RTYPN PAHATMVGQVPQS LCGLI FTL/RTPCRPS I LHPNY 
KI ISTSAWQKVL^FSGSPTIHTSIjHLTTGSSFLSFHPIPGFPAA 
NSAI* YVSS ItKGP PGKNVTI PS PVTGT*QP PHRGSN/RLTVDKDN 
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SEC 
ID 

NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding
to first
amino acid
residue of
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 

amino acid

sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








FFLSPKPNSLHQLPSQ\TPyQAI>TGAALAGSYPIWENSNTI*SWLf 
PTFTYNFCI^STPSLFFX.ODTN+YLCLPANWSGTCTLVFQAPTIN 
ILPPNQTILISVEASISSSPIRNKHAIJlLITLIjTGLGlTAAItGT 
G LAG ITTSI TSY QTL FTTLSNTVEDMHTS I TS LQRQLDFLVGVI 
I^NWRVT^LLTTEKGGTCIYI^F^CCFCWE^GIVHIAVRRIiHD 

T> 7\ 7\ C*T . *■ U ^\T7 T"» MT J/ - ^ I*"" C* r T Trl T Ti T«>\ T7\ T"> TTT C Tt T T Ut pt T T mtT/"*T» 

KAAr.Jj w Wy V/\L/o W nQbb o LlLlK V* X t* W VAPr IjoirijirxjrialjIjMIGP 

CI FNIiVSRFlSQRLNCFIQASMQKHIDN IFHLCHV* YQSLRGNH 
SEAPEFRP 


5856 


173 


1137 


PWLHGLGLSAVFLFYL+ / YVTFHLYGGI ILLLIjIFISIAGILYK 
FQDVLLYFPEQPSSSRLYVPMPTGI PHENI FIRTKDGIRLNUI* 
IRYTGDNSPYSPTI I YFHGNAGNlGHRLPNALDP^Via.KVNLLI. 
VDYRG YGKSEGEAS EEGbYLDSEAVTiDYVMTS PDLDKTKI YI*SG 
RSLG\GAAAIHLASDNSHRISAIMVENTFLiS I PHMASTLFSFFP 
MR YLPLWCYKNKFIjS YRKI SQCRMPS LF I SGLS dqli p pvmkkq 
LYE1»S PSRTKRLAI FPDGTHNDTWQCQGYFTALEQFI KEWKSH 
SPEEMAKTSSNVTI I 


5857 


1597 


563 


KLIGKVLVLS WADAMAAFAVEPQGPALGSEPMMLGS pts PKPG 
VNAGFLPG FLMGDLPAPVTPQFRS I SGPS VGVMEMRS PKLAGGS 
PPQP WPAHKDXSGAPPVRS I YDD I SSPGLGSTP LTSRRQPNI S 
VMQS PLVGVTS TPGTGQSM FS PAS I GQPRKTT1*S PAQLDP FYTQ 
GDSLTSEDH\ LDDS WGD CI WGFIiXAS A\S Y I LL \ QFAQYGGIS * \ 
NMWMSNTGNWMHIRYQSK1G^VRKA1»SKDGRIFGESIKIGVKPCI ! 
DKSVME SSDRCALSS P SLAFTP P I KTLGTPTQPG STPRI STMRP 
LATAYKASTSDYQVI SDRQTPKKDE SI*VSKAME YMFGW 


5858 


355 


1419 


PPHQPAAASTSXHQQQQPPPPPQDSSKPWAQGPGPAPGVGSAP 
PASSSAPPAIPPTSGAPPGSGPGPTPTPPPAVTSAPPGAPPPTP 
PSSGVPTTPPQAGGPPPPPAAVPGPGPGPKQGPGPGGPKGGKMP 
GGPKPGGGPGLSTPGGHPKPPHRGGGEPRGGRQHHPPYHQQHHQ 
GPPPGGPGGRSEEKISGPRRGFKANI^St.LRRPGEKTYTQRCRFC 
IiIiGIYI»IiISRRlWSRJU^FAKIWF^QEKFI^TiaWCDSEFIKI>ESR 
ALA*NCPKFELG * YTP+GGRQLPSSLFPTHACLPLSCSVI FS PF 
MFPQ*NCJWGRKPFRPNI*GPHLKGAVCNRWDDPWEGPTGKGHCIiN 
FAS 


5859 


307 


1503 


GGSSARPRASSRRMLSRKXTKNEVSKPAEVQGKYVKKETSPLLR 
NLMPS FIRHGPTI PRRTDI CkPDSSPNAFSTSGDGWSRNQS FL 
RTPIQRTPHEIHRR3SNRI.SAPSYLARSIADVPREYGSSQSFVT 
EVSFAVENGDSGSRYTYSDNFFDGQRKRPLGDRAHEDYRYYEYN 
HDLFQRMPQNQGRHASGIGRVAATSLGNIiTNHGSEDI»PI,PPGWS 
VDWTMRGRKY Y IDHNTNTTHWSHPLBREGLPPGWERVESSEFGT 
YYVDHTNKKAQ Y\RHPCAPTCTS V* ST'TS CH I /AS / RQQTERNQ 
SLLVPANPYHTAEI PDWIiQVYARAPVKYDHIIJCWELFQLADIiDT 
YQGMLKl a LFMKELEQI\n<hfYEAYRQAIjLTEbENRKQRQQWYAQQ 
HGKNF 


5860 


2956 


1270 


TIRVEEFPLCPGGGKAQI^SASI^iySAGLIaLQPPTPPPI^IJJaLFP | 
LLLFSRLCGALAGPI IVEPHVTAVWGKNVSLKCI*! EVNETITQI 

piiTPtfTtf/^irpertm n>tnrunnAVPC , oUnr't?vrv , 0\n.I?VMVCT xtfs "A T'T 
SWEKJLHvJi^yyTVAVHiaPv ^ ^ i^^ttv LiV r>JS ic>LiPiU+i.i. X 

TLHNIG FSDSGKYI CKAVTFPIjGNAQSS TTVTVI*VEPTVSI»I KG 

PDSLIDGGNETVAAI CI AATG KPVAHIDWEGDLGEMES TTTS FP 

NETATI I SQ YKIiFPTRFARGRR ITCVVKHPALEKD IRYSFI LDI 

^YAP1^SVTGVT«^IWFV^;I^GVNTJKC!^1ADA1TPPPFKSVWSRLK 

QWPTCLI^DNTLHFVHPLTFN YSGVYI CKVT\NSPGS KEVTQK 

VHPTFQDPSLPTYP PLPALQFQWAS PSTA* TSRD \ LATE P* KIA 

PSPLSTL\ATI KGWTQLPTI IA* CSGVGALF I V\ LVKCFGLG I P 

CiTlRRRTFRGDYFAKNYIPPSDMQKESQIDVIiQQDELDPYPDSV 

KXENKKPVNNLIRKDYl^EEPEKTQWNKVENliNRFERPMDYYEDL 

KMGMKFVSDEJIYDENEDDLVSHVDGSVISRREWYV 


5861 


2051 


1305 


EVCACVQAFWLVASSGDDSQGGDKCX5CEVGSWVGSMRVVMARLL 
SEGEQGI PTACAAFAQQPAG / EPRRGLAGVGEGGPQCSWVNYRC 
TLEFLVSLLGTDLARGRGNSASGPTAPADS KQI»/ ML * DVHRRVI 
I»E* RMNSGSPAREaOAPSQRFCTNLSEGLRFG IS PSWREALYGCH 
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co rre sponding 

to first 

amino acid 

residue of 
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sequence 


Predicted end
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corre sponding 
to first 
amino acid 
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amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








A 


5862


1556 


483 


PPFQLIMGBI KVSPDYNWFRGTVPLKKI I VDDDDS KI WS L YDAG 
PRSIRCPLI FXP PVSGTADVPTRQILAIjTGWGYRVXAI*QYPVYW 
DKLE FCDG FRKLLDHLQLDKVKLFGAS LGG FIiAQ KFAE YTHKS P 
RVHSLILCNSFSDTSIFNQTWTANS FWLMPAFMLKKIVLGNFSS 
GPVDPKMAnAIDFMVDRLESL^SEIASRiTLNC^NSYVEPH^I 
RDIPVTIMDVFDQSALSTEAKE1EMYKLYPNARRAHLKTGG1TFPY 
LCRSAEVNLYVQIH1./R/RNSMEPNTRPL.THQVJSVPRS LRCRKA 
ALASARRSSSVSIAVNDEIjTRCVLV* SVASAPVSRPFPSGSSGS 
PVLTVSGK 


5863 


2714 



249 


PFPSRGSLPIJUVPREirrP1GPLMVLFCLL,FLYPGlADSAPSCPQN 
VNISGGTFTI>SHGWAPGSlJjTYSCPQGI*YPSPASRIiCKSSGQWQ 
TPGATRSLSKAVCKPVRCPAPVSFENGIYTPRLGSYPVGGNVSF 
ECBDGF I \LRGS PVRQCRPNGMWDGETAVCDNGAGHCPNPG I SL 
GP\VRTGFRFGHGDKVRYRCSSNLVLTGSSERECQGNGVWSGTE 

P I CRQP YS YDFPED VAPALGTS FSHMLGATNPTQKTKESLGRKI 

QIQRSGHLNIiYLLLDCSQSVSENDFLI FKESASLMVDRIFSFSI 

NVSVAI ITFASEPKVLMSVLNDNSRDMTEVISSLENANYKDHEN 

GTGlOT^AAliNSVYIjMMmQMRIiI^METMAW 

DGK\S HMGGS P KTAVDH IRE I LN INQKRNDYLDI YAIGVGKLDV 

DWRELNSLGSKKDGERHAFI LQDTKAI*HQVFEHMI*DVSKLTDTI 

CGVGNMSANASDQERTPWHVTIKPKSOETXCNRGALISDQWVIjT 

AAHCFRDGNDHSLWRVNVGDPKSQWGKEFLIEKAVISPGFDVFA 

KKNQG IL\EF YGD \ DI ALI*\ KIAQKVKM \ STHCQG PSCLP\ CTM 

\EANLGFIjRETFKGSTCR\DHENEI»/VI'/NKQSV\ pahf \vai*\n 

GSKI^HLTIiRMGVEWTSCCRGI^PKKKTMXFPNLTVDVREXVVT 

D\QFL\CS\GPQEDESP\CK*E\SGGA\VF1iEKRFR1jSAGGVWC 

SWGL\YNP\CT/3SA\DKNSPKKGPSVAKVPPPTR/DFHIN\LFP 

Q+SPWI*RQHPGGMS * IFXPLLANGHLS P FACPARI CRPLKFLPS 

EWATLRTL. 


5864 


173 


1013 


PLISVPQSLISLPQPLLCFPGGQEPSAPS PCLYSFLWACS FTMG 
KLPPS I PPSS PIiACVLKNIiKPIiQLTPDIjKPKCLI FFCNTAWPQY 
KLDNDSX* PENGTFE FS I XrQ VL»DNS CHKMGKWS EVPDVQAF F \ S 
HWSLPS1»CSQC/GL I PNLiSSFS PFCSFG/ PPPQVPS P /TES FFS 
MDSSDLP PSPQAAPRQAEPGPN3HIASAP PPYNPFI TSPFHTWS 
SLQFHSVTSPPPPAQQFTLKKVAGAKGZ VKVSAPFSLSQIR* RX» 
GSFSSNIKIQPSSWLIWQQP 


5865 


568 


1684 


CIiPGPRWGEGWRAGHT I VGC I F FKTAI j-SHFKGGMYLCVCMCTC 
LSVCVCVQVGSWICV/CVSMCACVSLCTC\IC31CISMYTREHAC 
ACTRV* VYMCMS / VCTCVSTC IDVRVCAHV CVYMCLCLG YA* AC 
TCV*MCVCMHEHVCWC/VCACSCVIiIi/ CRGHICM/ MCMS AY I CI 
/CVYVCV"LCVmCMRMSTCVWLVYG*ACTCVWMHM/CSCT 
VHVCCMSMHACBCLCVYIJI1CGCAGTRRWWAGSARGSRSCSRLP 
CWAPGPGLSLPGP S CP SVEQGLGGGPGQLQGRSGEARLGEHRG W 
GSPAAVCSRNCTVS P RRGADCF3APDVP KQPPGWGRAS FEERGC 
GGRGWVCAP PLKGPQCCCFS I KPEUCAKKKK 


5866 


98 


3197 



ARPEVPAPPAWLSRRGAAKMGDKKDDKDS PKKNKGKERRDLDDI* 
KKEVAMTEHEOviSVEEVC^KYNTDCV(^LTHSKAQEII^K3P^ 
LTPP PTT P EWVKFCRQLFGGFS I LLW IGAI LCFXAYGI QAGTED 
DPSGDNLYLGIVLAAWI ITGCFSYYQEAKSSKIMESFKNMVPQ 
QALVIREGEKMQVNAEEVVVGDLVEI KGG DRVPADiiK I X bflHGC 
KVDNS S LTGESE PQTRS PDCTHE\NPliKTRN I TFFSNNFVEGTA 
RGVWATGDRTVMGRIATlASGLEVGKTPlAIEIEHFIQIilTGV 
AVFDGVSFFI h$Kl LGYTWLEAVT FLIGI I VANVPEGLIATVTV 
CliTLTAKRMAKKN CLVKNLEAVE TLGSTS T I CSDKTGTLTQNRM 
TVAHMWFDNQIHEADTTEDQSGTS FDKSS HTWVALF* H / LLG FC 
NRPVFKGGODNI PVLKRD VAGINAS 

RNKKVAEIPFNSTNKYQLSIHETEDPNDNRYLLVMKGAPERILD 
RCSTILLQGlO^PliDEEMKEAFQl^YLEIjGGliGERVIjGFCHYYli 
PEEQFPKGFAFDCDDVNFTTDNLCFVG1»MSMIGPPRAAVPDAVG 
KCRSAGIKVIMVTGDHPITAKAXAKGVGIIFEGNBTVEDIAARI* i 
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sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








NI PVSQVN PRDAKACVIHGTDUCDPTSEQIDEILQNHTEI VPAR ! 
TSPQQKLI I VEG CQRQGAIVAVTGDGVNDS PALKKAD I GVAMG I 
AGS DVS KQAADM I LLDDNFAS I VTGVEEGRLI FDNLKKS IAYTL 
TSNIPEITPPLLFIMANIPLPLGTITILCIDLGTDMVPAISLAY 
EAAESD I MKRQPRNP RTDKLVWERLI SMAYGQIGMI QALGG FFS 
YFVI LAENG FLPGNLVGIRLNWDDRTVNDLEDSYGQQVJTYEQRK 
VVEFTCHTAFFVSIVVVQWADLIICKTRRNSVFQQGMKNKILIF 
GLF3ETALAAFLSYCPGMDVALRMYPLK^SWWFCAFPYSFLI FV 
YDE I RKL I LRRNPGGWVEKET YY 


5867 


3 


1485 


L PGRRARG GRGLG WP PAQALDG S RMGKAKV PAS KRAP S S ? VAKP 
GPVKTLTRKKNKKKKRFWKSKAREVSKKPASGPGAVVRPPKAPE 
DFSQNWKALQEWLLKQKSQAP E KPLVISQMGS KKKPKI IQQNKK 
ETS PQ VKGEEMPAGKDQEASRGS VPSGS KMDRRAP VP RTKASGT 
EHNKKGTKERTNGDIVPERGD I EHKKRKAK\GQPQPHPPR/ IDI 
WFDDVDPADIEAAIGPEAAKIARKQLGQSEGSVSLSLVKEQAFG 
GLTRALAIJ>CEMVGVGPXGEESMAARVS I VNQYGKCVYDKYVKP 
TE P VTDYRTAVSGIRP ENLKQG EELE WQKEVAEMLKGRI LVGH 
ALHNDLKVLFLDHPKKK I RDTQ KYKP FKSQVKSGRPS LRLL SEK 
ILGLQVQQAEHCS IQDAQAAMRL YVNVKKEWES MARDRRP LLTA 
PDHCSDDA * QS C PAAAAAP LQR QCDQ SQGQI TS PQSGNSGETFS 
ESWQRGVAWCY 


5868 


2122 


833 


LTAGAS HTQDAS Q STS A KY PAAAQNL / CVTNAMREDLADI WYIR 
AVTVYDKPASFFKETPIJ5LQHRLFMKIi3SMHSPFRARSEPEDPV 
TERSAFTERDAGSGLVTRLRKRPALLVSSTSWTEDEDFS IL.LAA 
LESRV* T\MTLDGHNLPSLVCVITGKGPLRE YYSRLIHQKHFQH 
IQVCTPVa^AEDYPLLI^SAJDLGVClJiTSSSGLDLPMKVVDMFG 
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP 
DPAG KLNQFRIu^IjRESQQliRWDES W VQTVLPLVMDT 


5869 


2122 


833 


LTAGASHTQDASQSTS AKTPAAAQNL / C VTNAMREDLJUO I W Y I R 
AVTVYD KP AS FFKETPLDLQHRL FMKLGSMHS P FRARS E P E D PV 
TERSAFTERDAGSGLVTRLRERPALLVSSTSVrrEDEDFS ILLAA 
I*ESRV+T\MT1JX3JNLPSLVCVITGKGPLRBYYSRLIHQKHFQH 
IQ VCT P WLEAEDYPLLIiG SADIiGjVCLHTSSSGLDLPMXVVDM FG 
CCLPVCAVNFKCLHELVKHEENGLVFEDSF^LAAQLQMLFSNFP 
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5870 


2122 



833 


LXAGASHTQDASQSTSAKYPAAAQNL/ CVTNAMREDLADI W YI R 
AVTVYDKPASFFTCETPLDIiQHRLFMKLGSMHSPFRARSEPEDPV 
TERSAPTERDAGSGLVTRLRERPALLVSSTSWTEDEDFSI LLAA 
LESRV* T\MTLDGHNLPSLVCVTTGKGPLREYYSRLXHQKHFQH 
IQVCTPWLEAEDYPLLLGS ADLG VCXiHTS SSGLDLPMKWDMFG 
CCXPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP 
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5871 


3 


3465 



FFFCRPLRLYSKTTGDRSAMAGAAGLTAEVSWKVLERRARTKRS 
VLKLL* LSLRRL* LE PTI *NGLLT*CSRLSVFRFLKV\GSVYEP 
LKSINLPRPDl^TLWDKXDHYYRIVKSTLLLYQSPTTGLFPTKT 
CGGDQKAKrQDSLYCAAGAWALAIAYRRIDDDKGRTHELEHSAI 
KCMRGILYCYMRQADKVQQFKQDPRPTTCLHSVFflVHTGDELLS 
YEEYGHLQINAVSLYLLYLVEMISSGLOI IYNTDEVSFIQNLVF 
CV\ERVYRVP\DFG\ VWGKREGKYY* /SGSTELHS SS VGLG KRQ 
L* KQFNGFNLFGNQGCSWSVI FVDLDATOJRNRQTLCSLLPRESR 
SHNTDAALLPCISYPAFALDDEVLFSQTLDKVVRKLKGKYGFKR 
FLRDG YRTSLEDPNRCYYKPAE I KLFDG I ECEFPI FFLYMM IDG 
VTRGWPKQVQEYQDIJjTPVLHHTTEGYPWPKYYYVPADFVEYE 
KNNPGSQKRFPSNCGRDGKLFLWGQALYI IAKLLADELISPKDI 
D PVQR YVP LKD QRNVS MRFS NQG PLENDLWHVAL I AE SQRLQV 
FLNTYG I OTQTPO^VEP I QI WPQQELVKAYLQLG INE KLGLSGR 
PDRP I GCLGTS KI YRILGKTWC Y PI I FDLS DFYMSQDVFLLI D 
DIKNALQFIKQYWKMHGRPLFLVLIREDNIRGSRFNP ILDMLAA 
LKKOI IGGVKVHVDRLQTLISGAVVEQLDFLRISDTEELPEFKS 
FEELEPPKHSKVKRQSSTPSAPELGQQPDVNISEWKDKPTHEIL 
QKMTOCSClASQAIIJ^IIiKREGPNFITKEGTVSDHIERVYRR 
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5872 



5873 



5874 



Predicted 
beginning 
nucleotide
location
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 



68 



2240 



2 



5875 



296 



Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 



665 



506



Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)



AGSQKIjWSWRRAASIJjS KWDSIiAPS ITNVLVQGKQVTLGAFG 
EEEBVISNP1*SPRVIQNIIYYKCNTHDEREAVIQQEI,VIHIGWI 
ISNNPBLFSGTLKIRIGWIIHAMEYELQIRGGDKPALDLYQLSP 
SEVKQLXlLD I LQPQQNGRCWLNRRQ IDGSLNRTPTGFYDRVWQI 
LERTPNGI 1 VAGKHLPQQ PTLS DMTMYEMNFS LLVEDTLGNI DQ 
PQYRQIWELLMWS IVLERNPELEFQDKVDLDRLVKEAFNEFQ 
KDQSRLKE1 EKQDDMTS FYNTP PLGKRGTCS YLTKAVMNLLLEG 
EVKPNNDDP CLi I S 



VQG YMYRFV I KI NSC YSEKTS I CRHRCCPELPATQPW PTPTVF F 
N I AI DS ESLG C I \ S F KI> FADKV/ PKRVTCKNFVLLNTGE KVLGDK 
GPCFYRI IPG\LCQGGDFTHHNGTGGKSLYSKE FDDENFI / LKH 
TAPGVLSTANAGPTTNGSQFFI CTAKTEDG *QHWFGKVKDGMS 
IVEA1»ERSGSRNGKTSKKITAANCGQL» 



3387 



1848 



RRPP EGGSGGGRRTRARMP LPW S IiAL»PLlil»S WVAGG FGNAAS AR 
HHGLIiASARQPGVCHYGTKLACCYGWRRNSKGVCEATCEPGCKF 
GECVGPNKCRCFPGYTGKTCSQDVNECGMKPRPCQHRCVNTHGS 
YKC FCLSGHMLMPDATCVNSRTCAM INCQYS CEDTEEGPQCLCP 
SSGLRIJVPNGRXJCLDIDECASGKVICPYNRRCVNTFGSYYCKCH 
IGFEl^YISGRYDCIDINECTMDSHTCSHHANCFNTQGSFKCKC 
KQGYKGNGLRCSAI PENS VXEVLRAPGTIKDRI KKLIiAHKNSMK 
KXAKI KNVTPE PTRTPTP KVNLQP FNYEEIVS RGGNSHGG \ KKG 
NEE KM KEGLEDE KREEKAL KD *HRRER P FRG \ D VF FP KVNEAG E 
FGLI IA VQRKALTS KLEHKADLNI SVDCS FNHG \ I CDW \ KQDR \ 
EDDFDW\NPADR\DNAI\GFY\MAVPGLWQGHK\KDIGRLKLI,L 
PDLQPQSNFCLLFDYPJIlAGDKVGKLRVFVKNSNNALAWEKTTSE 
DEKKKTG KI QLYQGTDATKS I IFEAERGKGKTGEIAVDGVLLVS 
GLCPDSLLSVDD 



ACPRLARRRRRVRSljRi^RRGWLRARWSRGQNNMAARRITQETFD 
AVIK3EKAKRYHMDASGEAVSETIiQFKAQDLI*RAVPRSRAEMYDD 
VHSDGRYSLSGSVAHSRDAGRESLRSDVFSGPSFRSSNPSISDD 
SYFRKECGRDI*EFSHSNSRDQVTGHRKI«GHFRSQDWKFALRGSW 
EQDFGHPVSQES S WSQEYSFGPSAVLGDFGSSRL I EKECLEKE \ 
SRDYDVDHSG\EA\D SVLRGS \SQVQA\ RGRALN I VDQEGS LLG 

. KGETQGLLTAKGGVGKLVTLRNVSTKKI PTVNRI TPKTQGTNQI 
QKNTPSPDVTLGTNPGTEDIQFPIQKI PLGLDI1KNI.RI1PRRKMS 
FDIIDKSDVFSRFGI EI IKWAGFHTI KDDI KFSQIiFQTLFEIjET 
ETCAKMLAS FKCSIjKPEMRDFCFFTIKFIjKHSALKTPRVDNEFL- 
NMLLDKGAVKTKNCFFEI I KP FDKY IMRLQDRLLKS VTPLLMAC 
NAYELSVKMKTLSNPLDLALALETTNS LCRKSLALLGGTFSLAS 
S FRQEK I It * AVGLQDIAPS PAAFPNFEDSTL.FGREY IDHLKAWL 
VSSGCPLQVKKAEPEPMREBEKMIPPTKPEIQAKAPSSLSDAVP 
QRADHRWGTIDQLVKRV I EGS LS PKE RTLLKEDPAYWFLSDEN 
SLEYKYYKIjaiABMQRMSENIiRGAIXJKPTSADCAVI^ 
RNLKKKLLP\WQRRGLI>RA(^\I^G\WKARRA\TTGTQTLLFLR 

. APGLKHHGRQAPGLS \QAKPSI*PDRND\AAKD\CPI*DPV\GPS P 
QDPSLEASGPSPKPAGVDISEAPQTSS PCPSADIDMKDNGRTAE 
K1ARFVAQVG\PEIEQP\SI\ENSTDNPDLWFL\HDQNSS\AFK 
FY\RKKVFELCPSICFTSSPHNI*\HTGGGDTT\GSQESPVDLME 
GEAEFEDEPPPREAELESPEVMPEEEDEDDEDGGEEAPA\PGRG 
GPSLEGSTPADGLPGEA\AEDDL/ALGAPAI*FTGLliQVTCFPFG 
RGFSSKSLKVGMI PAP KR VCL I QE P KVHEPVRI AYDRPRGRPKS 
KKKKPKDI>DFAC^KI*\TDK\NLGFQ\MLQKMGWKEGHGLGSLGK 
G IR\ SRSACTQQAAWGGSGWGLS PSTCSLPLGS FTAKMAYSWQL 
IFVF 



LAALGGLPLWRLSRRG FREYLLGLSAPSALGGAMRSVS YVQRVA 
I^FSG^LFPHAICMDVDNDITiNELVVGDTSGKVSVYKNDDSRP 
WLTCS CQGMLTCVGVGDVCNKG KNLLVAVS AEGWFHLFDL.T PAK 
VLDASGHHETIj I GEEQRP VFKQH I PANTKVMLI SDIDGDGCREL 
VVGYTDRVVRAFRWEEIjGEGPBHLTGQLVSIjKXWMLEGQVDSLS 

VTLG PLGIiP ELMVSQ PGCAYAI LLCTWKKDTGS PPAS EG PTDGS 
/SGDPSCPRRGAAPPIWPYPQQECLHSPNWQHQT\SHGTESSGS 
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SEQ 
ID
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








GLFALCl iAJGTbKIJ^EEMEEADKIjLWSVQVDHQLFAI^KI^DVTG 
NGHEEVVACANDGOTY 1 1 DHhTRTWRFOVDEMT'Ra PfAHT .Ynr»Tr 1 

KGRNSPCLVTVTFT3QKIYVYWEVQLERMESTNLVKLLET:<P\ST I 
TACCP^WAWILTTSL+LVPCPTKRSTIOTSHHSVLPQASRIPPS 
WTCLIAGEGFF*TPTLPPKGVFGSHCAAAGSITKQ | 


5876


1122 


224 


HLPLGVPS KVAGAAAME PQEERETQVAAWLKKI FGDHP I ?Q YE V 
KPRTTE I LHHLS ERNR VRDRD VYLVTF. DL KQKASE YES EAKYLQ J 
uu jjtico vim r o frtN Lio o 1 ooa i loWAijViJia AVALfcTI KDTS LiAS F I P | 
AVNDLTSDLFRTKSKS EE I KIEl^KLE KNLTATt»VLEKCXOEDV 
KKAEIJrLSTER\AXVDNRRQNM\DFLKAKSEEFRFGIQAAGEQL 
SARGQ\DAFSVP IQSLVALIRBNWPRLKQQTI PLK\KKL£S VXD 
IMP\NPSHCSK*RIEEAK\REIA\SIEAELTRRVS\MMEIi j 


5877 


2030 


1907 


GTLGKMAASSSGEKEKERI^RRT^VA^NSTPKRT.T.saT.wnT^pri/ j 
LSRELI EMLA ISRNQ KUjQAGEENQVT iR T J iI HRDGEFQELM KLA 
LNQGKIHHEMQVLEKEVEKRDSDICX3LQKQLKEAEQIIJVTAVYQ 
AKEKLKSIBKARKGAI SSEEI I KYAHRI SASKAVCAPLTWVPGD 
^KRPYPTDLEMRSGLLGQMNNPSTNGVN^ 

CPCSTVS/NGSQMTCR*INIIIiILQKSVCEI» | 


5878 


950 


2113 


GLWKCMQLQGPHTHRVQP * PTPRQQGPQ \VPVAVIAGNRPNYLY j 
RMIJ^LI^QGVSPQMITVFIDGYYEEPMDVVAI/FGLRGIOHTP 
I S I KNARVS QHYKAS LTAT FNX.FP EAKFAVVLE EDLiD I AVD F FS 
FLSQSIHLLEEDDSL.YCISAWNDQGYEHTAEDPALLYRVETMPG 
LGWVTJUISIjYKEELEPKWPTPEKLWDWDMWMRMPEQRRGRECI I 
PDVSRSYHFG1VGLNMNGYFHEAYFKXKJCFNTVPGVQLRNVDSI* 
KKEAYEVEVHRIiSEAEVLDHSKNPCEDSFLPDrEGHTYVAPlR j 
ME KDDD FTTVH'QLAKCI^I WDIJD VRGNHRGLV7RIjFRKKNHFIiVV 
GVPASPYSVKKPPSVTPIFLiEPPPKEEGAPGAPEQT ] 


5879 


3 


981 


RliTEAAAAGSGSRAAGWAGS PPTIjIiPIjS PTS PRCAATMAS S D ED 
GTNGGASEAGEDREAPGKPJRRLiGFIiATAWLTFYD I AMTAGW LjVI> 
AXAMVRFYMEKGTHRGLYKS IQKTLKFFQTFALLErVHCLIG IV 
PTSVI VTGVQVSSR IFM VWLITHS I KPIQNEE S WLFLVAWTVT 
EITRYS F YTFSI»LDHLPYFI KWARYNFFI ILYPVGVAGELLT I Y 
AALPHVKKTGMFS IRLPNKYNVS FDYY YFLLITMASYI PLFPQLi 
YFHMLRQRRKVLHG\G*L* KRMIK* SLQTRCFFQNNQDYLS PS F 

WKTTfKTTTfVT rTT CtJ T \ n.l OT rr-r I 
WiNlUN JS.yjjL.tiXoVf x VNr i_rX\. J. 1 


5880 


1138


1324 


S IiWCIiVAGGLGItG PS SQNPLQRAG IliARPREARGT FS7U/TACSA 
SVTSKGKSSSGMWPSAASDRDS PVPIiRPPGPVQIiPSGTGWVLSD 1 
♦KKKRGRCSS/WLSQPQHEREKEVVLLRP^MAEGERARAASDVL j 

rt,o JU/-UN C i. n^J^ftrC X JLf X A 4 >LtliyiC-v^.rt.Lt/Vr>.^ .1 il >&K\JriJ\\£rCr4 V bLKo tr I 

DQSEHTDGHTS VQ S V I EKLQEEl^LLKQKVTHVEDLNAKWQRYN 1 
ASRDE YVRGLHAQLRGLQI PHEPEXMPJOSISRLNRQLEEKINDC | 
AEVKQELAASRTARDAALERVQMIjEQQI LAYKDDFMS eradrer 
AQSR I Q ELE EKVASL LHQVS WRQ DS REP DAGR IHAGS KTAKY LA 
ADAI^LMVPGGWRPGTGSQQPEPPAEGGHPGAAQRGQGDLQCPH 
CLQCFSDEQGEELLRHVAECCQ 


5881 


26 


441 


GGI HPSPTEAPRAQHIiTMDCTWRILFIiVAAATGTHAQVQLLQSG | 
S E VKKPGASVMVS CYVSG YTLTKLSMHX'TVRQAPG KG LB * MGP FD 
LQDVETIYPQKFQGRVSMTEETSTETTQ/ AYLEI>SSIi!iSEDTAV 
HHCATDTV ) 


5882 


2407 


2216 


RTRRD*QLPEAGGPGI^EPIiQLGEIJ>ITSDEFILDEVDG\VDLR 1 
HYS KQVKL ELQQ I EQ KS I PJD Y I QES EN IAS LHNQ 1 TACDAV LER 
MEQMIXSAFQSDLSSISSEIRTLQBQSGAMNIPJJlimQAVPJSKIiG 
EL)VDGI*WPS ALVTAI LELAP VTEPRFIiEQl^ELDAKAAAVREQE 
ARGTAACADVRGVLDRLRVKAVTKIREFILQKIYSFRKPMTNYQ 1 
I PQTAIJjK YRFFYQFLI/3NERATAKE 1 RDE YVETLS KI YLS YYR 
S Yl^RJjMKVQYEEVAEKDDLMGVEDTAJCKGFFSKPSI^SRNTI F 
TLGTRGS V I S PTE LEAP I LVPHTAQRGEQR YPFEALFRSQH YAi j 
LDNS CRE Y LFI CEF FWSGPAAHDLFHAVMGRTLS MTLKHLDS Y 1 
LAI^CYIXAIAVFLCIHIVLRFRNIAAKRDVPAI>DR^ | 
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NO; 



5883 



5884



Predicted
beginning 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 



Predicted end 
nucleotide 
location 
corresponding
to first
amino acid
residue of 
amino acid 
sequence 



1374 



Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
PRFELI IiEWNVQSVRSTDPQRLGGLDTRPHYITRRYAEPSSALV 
SIKQTI PNERTMQLLGQLQVEVENFV1»RVAAE FSSRKEQLVFLI 
NNYDMMLGVLM\E* ERAADDSKEVES FQQLLNARTQEFI EELLS 

PPFGGLVAFVKEAEALIBRGQAERLRGEEARVTQIjIRGFGSSWK 
SSVESLSQDVMRSFTNFRNGTSI I QGALTQLI Q\ I.YHRFHRV\I, 
SQPQLRAIiPARAEIilNIHHLMVELKKHKPNF 



4261 



2522 



EFPGRRFRAVMEAGAGAGAGAAGWSCPGPGPTVTTLGSYEASEG 
CERKKGQRWGSLERRGMQAMEGEVLiPALYBEEEEEEEEEEBVE 
EEEEQVQKGGSVGSLSVNKHRGLSLTETEIuEELRAQVLQLYAEL 
E E TREl^AGQHEDDS L ELQG I.L»EDERIiAS AQQ AEVFTKQ I QQ LQG 
ELRSLREE X S LLBH E KESE LKE I EQEUTLAQAE I QSLRQAAEDS 
ATEHESDIASIjQEDIiCRMQNEiEDMERIRGDYEME IASLRAEME 
MKSSEPSGSU3LSDYSGI^EEU5ELRERYHFLNEEYRALQESNS 
SLTGOLADt>ESERTQRATERWI/QSO/rLSMTSAESO/rSEMDFLEP 
DPE^QlilJiQQIjRDAEEQtWGMKNKCQEIjCCELEEIiQHHRQVSEE 
EQRRIiQRELKCAQNEVLRFQTSHS\SPSHPI*PPIPPSSPCLI.*A 
LWISALLWCWWAETSS 



GVliARASARLRVPLTGVRACAEPEVGAE PAKVAGAAEPDEDGGR 
S RLRDCG D Y T PS ERLG PKGAMLW FQGAI PAAI ATAKRSGAVF VV 
FVAGDDEQSTQMAASWEDDIOrrEASSKSFVAIKIDTKSEACLQF 
SQ I YP WCVPSS FFIGDSGI PI^VTAGS VS ADELVTRIHKVRQM 
HLLKSETSVANGSQSESSVSTPSASFEPNNTCENSQSRNAELCE 
I PSTSDTKSDTATGGESAGHATSSQEPSGCSDQRPAEDLNIRVE 
RLTKKT.EERREEKRKEEEQREIKKEIERRKTGKEMLDYKRKQEE 
ELTKRMLEERNREKAEDRAARERI KQQIALDRAERAARFAKTKE 
EVEAAKAAA1jIjAKQAEMEVKRESYARERSTVARIQFRI»PDGSSF 
TNQFPSDAPLEEARQFAAQTVGNTYGNFSLATMFPRREFTKEDY 
KKKLLiDLEIiAPSASVVLKP/ ALFINF * AGRPTAS IVHSSSGDI W 

TLI/STVLYPFliAIWRIjlSNFLFSriPPPTQTSVRVTSSEPPNPAS 
SSKSEKREPVRKRVLEKRGDDFKKEGKIYRLRTQDDGEDENNTW 
NGNSTQQM 



5885 



900 



467 



AAGGGRRSRLSRSWPTGPSKSPSGVRCCG\RR\AWEDKDEFIiDV 
IYWFRQIIAVVI/3VIWGV1,PIJIGFLGIAGFCLINAGVI*YLYFSN 
YLQ I DEEEYGGTWELTKEGFMTSFA/ IVHGHLDHLIiHCHPL * LM 
VYSSQVLPI QS KGPS 



5886 



86 



1341 



PFRGRAIiTLKKQPRPGVAPPSLGTCHKSDPGRPAAQSQPPSPGS 
GTFGLLSFRKVHiTKlVJTLKKHFVGYPTNSDFEliKTSELPPLKNG 
EVLLEALFLTVDPYMRVAAK^LKEGDTMMGQQVAKVVESKNVAIj 
PKGTI VLAS PGWTTHS I SDGKDLE KLLTEWPDT I PLSLALGTVG 
MPGLTAY FGLL E I CGVKGGE TVMVNAAAGAVG S VVGQ IAKL KG C 
KWGAVG S DE KVAYLiQKLG FDVVFNYKTVES LEE TL»KKAS P DG Y 
DCYFDNVGGEFSNTVIGQMKKFGRIAICGAISTYNRTGPLPPGP 
PPEIGIYQE3jRMEAFVVYRW(^DARQKAIiKDLLKWVI*EI*PYFVI 

d*lqantlvyksmksakpsleyiseklvsg\kiqykeyiiegfe 
nmpaafmgmlkgdnlgktivka 



5887 



1937 



104 



APGCRGCRAXRCPCRGPRWDSLGDEAARS PAAPGGAPGIiLGLRE 
RPDRCKPGGDDRGPQLHRGSPG /S PSELSRRPGPPGIiPGLQGPP 
PAPGIiPQSRTL/ PVLCVCDIiS PAQCDINCCCDPDCSSVDFSVFS 

acsvpvvtgdsqfcsqkaviyslnftanppqrvfelvdqinpsi 
fcihitn\*nlhyplliqky1,/nennfdtlmktsdgftlnaesy 
vs fttkld i ptaakye ygvpliqtsdsflr f p sslts s lctdnnp 
aaflvnqavkctrxinleqceeiealsmafysspeilrvpdsrk 
kvpitvqs I viqslnktltrredtdvlqptlvnaghfslcvnw 

LEVKYSLTYTDAGE VTKADLS FVLGTVS S VWPLQQKFE IHFLQ 
ENTQPVPLSGNPGYWGLPLAAGFQPHKGSG 1 1 OTTNR YGQliT I 
LHSTTEQ DCLAXiEG VRTP VL FG YTMQSG C KUR.L»TG AL P CQL VAQ 
KVKSLLWGQGFPDYVAPFGNSQGP/AD^DWVPIHFITQSFNRK 
DS CQLPGAIjVI E VKWTKYGSUUN PQAK1 VNVTANLI S S S FPEAN 
SGNERTI LIS TAVTFVDVS APAEAG FRAP PAINARLPFNFFFPF 
V 
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SBQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


5888 


375 


2302 


LIiCRTPGVAMQRADSEQPSKRPRCDDSPRTPSNTPSAEADWSPG ' 
LELHPDYKTWG P EQVCS FXjRRGG FEEP VLLKNIRENE I TGALLP 
CXDESRPEOTX^SSI/SERKKIiSyiQRLVQIHVDTMKVINDPIH 
GHIELHPLLVRI IDTPQFQRLRYIKQLGGGYYVFPGASHNRFEH 
S LGVG YLAG CL VHALG EKQ PELQ I S ERD VLCVQ IAGLCHDLGHG 
PFSHMFDGRFIPIiARPEVKWTHEQGSVMMFEHLINSNGI KPVME 
QYGL I P EEDI CFI KEQI VGPLES PVEDSLWPYKGRPENKS FLYE 
I VSN JCRNG 1 DVD KWD YFARDCHHLG I QNN FD YKRF I KFAR V CEV 
DNELRICARDKEVGl^YDMFOTRNSLHRRAYQHKVGNI IDTMIT 
DAFLKADDY I E I TGAGGKKYR I S TAIDDMEAYTKLTDNI FIiEIL 
YSTDPKLKDAREILKQIEYRNLFKYVGETQPTGQIKIKREDYES 
LPKEVASAKPKVLLDVKLKAEDF IVDVINMD YGMQBKNP I DHVS 
FYCKXAPNRAIRI TKNQ VSQLLP \ E KFAEQ\ L IRVYCKKVDRKS 
LYA\ARQYFVQW\CADR\NFT\KPQDGRCY*PPTP*HPQKKGW\ 
NDSTFSPKI PTRLPRRLPKSRV\QLFKDDPM 1 


5889 


1831 


731 


LPAACGRPVTARPRQAPEGRSGRPRDL3PYPPQVFPPRPDRVAI 
VTGGTDGIGYSTAKHLARLGMHVI IAGNNDS KAKQWSKI KEET 
LNDKET* VLLCCPGWLCLWNSSDPPTSASRGAGTTGVHHHFLLK 
FG I FI L \ DLAS MTS IRQFVQKFKMKKI PLHVL INNAGVMMVPQR 
KTRDG TO EHFGIJTYLGHFLLTNIjLIjDTIjKESGS PGHS AR WTVS 
SATHYVAELNMDDLQS S ACYS PHAAYAQS KLALVLFTYHLQRL1, 
AAEGSHVTANVVDPGVVNTDLYKHVFWATRLAKia^ 
DEGAWTS IYAAVTPELEGVGGRYLYNKKF5TKSLHVTYNQKLQQQ 
LWSKS CEMTGVLDVTl* 


5890 



1322 


200 


FRRG WSAAGRAVPVAFCSRI SASS PRRPRGAVRLQSGTEAACRS 
GRPDPRPASAAGGHAGERMSQRDTLVHLFAGGCGGTVGAI L.TCP 
LEWKTRLQS S S VTLY I S E VQDNTMAGAS VNR WS PGPLHCLKV 
ILEKEGPRSLFRGLGPNIiVGVAPSRAIYFAAYSNCKEKLNDVFD 
PDSTQVHMI SAAMAG FTAI TATNPI WL I KTRLQI* * /SQGTAG KR 
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFVIYES I 
KQKLLE YKTASTMENDEESVKEASDF VGMMLAAATSK\ LVATTI 
AYPHEWRTRLREEGTKYRS FFQTLSLLVQEEGYGSLYRGLTTH 
LVRQ I P \NTAIMMAT YELWYLLNG 


5891 


1322 


200 


FRRGWS AAGRAVP VAFCSR I S AS SPRRPRGAVRLQSGTEAACRS 
GRPDPRPAS AAGGHAGERMS QRDTLVHLFAGGCGGT VG A I I..TCP 
LEVVKTRLQSSSVTLYI SEVQLNTMAGAS VNRWS PGPLHCLKV 
ILEIGiGPRSIjFRGIiGPNIjVGVAPSRAIYFAAYSNCKEKLNDVFD 
PDSTQVHMI SAAMAGFTAI TATNPIWLI KTRLQI.* /SQGTAGKR 
RMGAFBCVRKVYQTDGLKGFYRGMSASYAGI SETVIHFVIYES I 
KQIG^EYKTASTMENDEES VKEASDFVGMMLAAATS K\ LVATTI 
AYPHEVVRTRLREEGTKYRS FFQTLSLLVQEEGYGSLYRGLTTH 
LVRQI P \NTAIMMAT YELWYLLNG 


5892 


1764 


379 


VVLRVCGRI^VNSAVSSRTGGWSAGLTCAMQRLQVVLGHLRGPA 
DSGWMPQAAPCLSGAPHASAADVVVVHGRRTAICRAGRGGFKDT 
TPDELLSAVMTAVLKDVXvTL^PEQLGDICVGNVLQPGAGAIMARI 
AQFLSDI PETVPLSTVNRQCSSGLQAVAS IAGGIRNGS YDIGMA 
CGVESMS LADRGNPGN I TSRLMEKE KARDCL I PMG I TS ENVAER 
FG I S RE KQDT FALAS Q Q KAARAQ S KGCF QAE I VP VTTTVHDD KG 
TKRS ITVTQDEG IR PSTTMEGLAJOjKPAFKKDGSTTAGNS SQVS 
DGAAAILLARRS KAEELGLP I LGVLRS YAWGVP PD I MG IGPAY 
AI PVALQKAGLTVSDVDI FEINE \ AFAS QAAYCVEKLRLPP * EG 
* TPLGGASGP * GHPLGLHWGHVQVI TLAQ * S * SARGKRAYRSGC 
PCAIGSWNGS PLPVFEYPWGT 


5893 


3 


1653 


IIjSKRRCQKAKTKELMAKKVAVIGAGVSGLISLKCCVDEGLEPT 
CFERTEDIGGVWRFKENVEDGRAS I YQSVVTNTSKEMSCFSDFP 
MPEDFPNFLHNSKT.T.EYFRI FAKKFDLLKYIQFQTTVLSVRKCP 
DFSSSGQWKWTQSNGICEQSAVFDAVMVCSGHHILPHIPLKSFP 
GMERFKGQYFHSRQYOCHPDGFEGKRILVIGMGNLGSDIAVELSK 
NAAQVFISTRHGTWVMSRISEDGYPWDSVFlTITiFRSMIiRNVLPR 
TAVKHMI EQQMNRWFNHEireGLEPQNKYI MKEP VLNDDVPSRLL 
CGAI KVKSTVKELTETSAI FEDGTVEENIDVI I FATGYS FS FPF 
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ID 
NO:


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LEDSLVKVENNMVSIiYKY IFPAHLDKSTLACIGI.I QPLGSI FPT 
AELQARWVTRVFKGLCSbPSERTMMMD 1 1 KRNEKRIDLFGESQS 
QTLG/ENYVDYI*DEI*AIiEIGAKPDFCSL^ 

S Y + YRLVG PGQ WEGARNAI FTQKQR I LKPLKTRALKDS SNFSVS 
FI»LKILGLLAVWAFF\ CQLQWS 


5894 


174 


1573


RYS PKKVLQNKESSIiKliGMATALVSAHS IiAPLNLKKEGLRWRE 
DHYSTWEQGFKLQGNS KGI^QEPLCKQFRQI»RYEETTGPREAI»S 
RL.REIiCQQWLQPETHTKEHILELLVL.EQFLI I LPKELQARVQEH 
HPESREDVVVVLEDLQLiDLGETGQQVDPDQ PKKQKI LVEEMAPL 
KGVQEQQVRHECEVTKPEKEKGEETRIENGKLrVVTDSCGRVES 
SGKI S EPMEAHNEG SNIjERHQAKPKE KI E YKCSEREQR F I QHIiD 
LIEHASTHTGKKIiCBSDVCQSSSLTGHKKVLS * ERKVIQC\HGV 
LGKAFQRSSHI*VRHQKIHIjGEKPYQCNECGKVFSQNAG13LEHI»R 
IHTGEKPYIiCIHCGKNFRRSSHLNRHQRIHSQEEPCECKECGKT 
FS QAIiIiLTHHQRIHSHS KS HQCNECG KAFS I/TS DLi I RHHRIHTG 
EKPFKCNI CQKAFRLNSHIAQHVRI HNEEKPYQCSECGEAFRQR 
SGLFQHQRYHHKDKLA 


5895 


2967 


86 


HPSLLGAIPFYPPPSSPWPPPIiYLFWNSHRKSRHFINQRGXHGE 
KRLFVSDG\^GCIjPVLAAAGRARGRAEVLISTVGPEDC^^ 
RPKVPVLQIiDSGNYLFSTSAICRYFF\LIjSGWEQDDI»TNQWIiEW 
EATELQPTLS AALY YIi\ WQGKKG \ ED VLGS VRRTLTH I DHSLS 
RQ\NCPFIiAGETESIiAD I VIiWGALYPl»LQDPAYLPEELSAIiHS W 
FOT1^TQ\EPCQR\AARR1jVLKQ\O^VIALR\PYLQKQPQPSPA 
EGKGLSP IEPE3EEIATLSEEE IAMAVTAWEKGLESLP PLRPQQ 
NPVLPVAGERMVLITSALPYVKNVPHIiGN I IGCVLSADVFARYS 
RLRQNNTLYLCGTDEYGTATETKAL \eegltpqe icdkyhi iha 

diy\rwfnisfdifgrtttpqq\tkit\qdifqql»ijkrgfviaqd 
tveqlrcehcarf\ladrfvegvcpfcgybeargdqcdkcgkli 
navelkkpqckvcrscpvvqssqhlfldlpklekrleewlgrtl 
PGSDWTPNAQF I tpffgfrewps kprwq*trdlk\ wgnpgtp* e 
gfedk\vfyvwfdatlgybsi tanytdqwerww \knpbqvdlyq 
fm\akdnvpfhslvfpssalgaednytl\vshi*iateylnyedg 
k\fsksrgvgvfrdm\ahdtgippdisrfyl\lyirpegk\dsa 

FSWTDIjIXKNN S \ ELIjNNIjGNF INRA\ GMF VS KFFGG \ YVPEMV 
LTPDDQRIJA\ir7TLELQIIYIIQ\LIjEKVRIRDAliRS ILTIS \RH 
GNQYI \QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAAIiLSVML 
QPYMPTVSAT I QAQLQLP P PACS I IiLTNFIjCTLP AGHQ I GTVSP 
LFQKLENDQ I ESLRQRFGGGQAKTS P KPA WETVTTAKP QO I Q A 
LMDEVTKQGNI VRELKAQKADKNEVAA5VAKLLDLKKQLAVAEG 
KPPEAPKGKKKK 


5896 


2967 


86 


HPSIOXiAIPFYPPPSSPWPPPIiYLFWNSHRKSRHFINQRGIHGE | 
MRL FVSDGVPGCIiPVIAAAGRARGRAEVIiISTVG P EDCWP FliT 
RPKVPVLQLDSGNYLFSTSAI CRYFF\LI*SGWEQDDLTNQWIjEW 
BATEI*QPTLSAAI*YTIi\ VVQGKKG \EDVLGSVRRTI*TH I DHS I*S 
RQ\NCPFLAGE TESIiAD I VI*WGALY PIjLQDPAYLPEEIjS AliHSW 
FQTLSTQ\EPCQR\AARRLVI,KQ\QGVI*ALR\PY1»QKQPQPSPA 
EGKGLSPIEPEEEEI»ATLSEEEIAMAVTAWEKGLESLPPI*RPCX3 
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS 
Rl^QWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHI IHA 
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQI*LKRGFVLQD 
TVEOLRCEHCARFV LADRFVEGVCPFCG YEEARGDOCDKCGKLI 
NAVEL KXPQCKVC^SCPVVQSSQHLFIiDLPKLEKRIiEEWXiGRTL 
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E 
GFEDK\VFYVW FDATIG YI»S I TANYTDQWERWW \ KNPEQVDLYQ 
FM\AKDNVPFHS LVFPSS ALGAEDNYTI* \VSHL I ATE YI»N YEDG 
K\FSKSRGVGVFRDM\AHDTG I PPDISRFYlALY IRPEGK\DSA 
FS WTDLLLiKNNS \ EXiIjNNLGNF INRA\GMFVSKFFGG \ YVPEMV 
IiTPDDQRIiIA\HVTI*ELQHxHQ\ I^EKVRI RDALRS I LT I S \ RH 
GNQYI \QVNEPW\ KRIKGSEADRQRAGTVTGLAVNIAALLS VML? 
QPYMPTVSATIQAQIiQIiPPPACSILLTNFLCTLPAGHQIGTVSP 
LFQKLENDQIESIiRQRFGGGQAKTS PKPAWETVTTAKPQQ I QA 
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SEQ 
NO: 


Predicted
beginning
nucleotide
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LMDEVT KQGNI VREI>KAQKAD KNE VAAE VAKIJjDIiKKQLiAVAEG 
K-PPEAPKGXKKK 


5897 


2967 


86 



HPS3JLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE 
MRLFVSIX3VPGCLPVLAAAGRARGRAEVLI S TVGPEDCWPFLT 
RPKVPVLQLDSGNYLFSTSAI CRYFF\LT^SGWEQDDLTNQWLEW 
EATEDQPTLSAALYYIj \ WQG KKG \ EDVLGS VRRTLTHI DHS LS 
RQ \NCP FLAG ETE S LAD I VLW GAL YPLLQD PAYLP E E LSALHS W 
FQTLSTQ \EPCQR \ AARRLVLKQ\QGVLALR\ PYLQKQPQPS PA 
EGKGLS P I E PEEEELATLSEEE I AMAVTAWEXGLESLPPLRPQQ 
NPVLPVAGERNVLITSALPYVNNVPHLGNI IGCVLSADVFARYS 
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHI IIIA 
DI Y\RWFNI S FDI FGRTTTPQQ\TKIT \QD I FQQLLKRGFVLQD 
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKIiI 
NAVEIJCKPQCKVCRSCPWOSSQHLFLDLPKLEKRLEEWLGRTL 

GFEDK\VFYVWFDATIGYLS I TANYTDQWER WW \KN PEQVDLYQ 
FM\AKDNVP FKSLVF P S S ALGAE DNYTL\ VSHL IATE YLNYEDG 
K\ FSKSRGVGVFRDM \AHDTG I PPDISRFYL\LYIRPEGK\DSA 
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVS KFFGG \ YVPEMV 
LTPDDQRLLA\HVTLE~jQHYHQ\LLEKVRIRDALRS iltis\rh 
GNQYI \QVNEPW\KRIKGSEADRQRAGTVTGLAWIAALL5VML 
QPYMPTVSATIQAQLQLPPPACS ILLTNFLCTLPAGHQIGTVSP 
LFQKLENDQIESLRQRFGGGQAKTSPKPAVVETVTTAKPQQIQA 
LMDEVTKQGNIVRELKAQKADKNrEVAAEVAKTiIaDLKKQIAVAEG 
KPPSAPKGKKKK 


5898


2967 


86 


hpsllgaipfypppsspwppplylfwnshrksrhfinqrgihge 
mrlfvsdgvpgclpvlaaagrargraevli s tvgpedcwpflt 
rpkvpvlqldsgnylfstsaicryff\llsgweqddltwqwlew 
eatelqptlsaalyyl\wqgkkg\edvlgsvrrtlthidhsls 
rq\ncpflagetesladivlwgalypllqdpaylpeelsalhsw 

FQTLSTQ \ EPCQR \ AARRLVL KQ\QG VLALR\ PYLQ KQ PQPS PA 
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKX3LESLPPLRPQQ 
NPVLPVAGERNVLITSALPYVNNVPHLGNI IGCVLSADVFARYS 
RLRQWNTLYLCGTDE YGTATETKAL\ EEGLTPQE I CDKYHI IHA 
DIY\RWFNI S FDI FGRTTTPQQ\TKI T\QDI FQQLLKRGFVLQD 
TVEQLRC^HC^u^FXLADRFVEGVCPF^XsYEEARGIOQCDKCGKLI 
NAVELKKPQCKVCRS CP WQS SQRLiFLDLPKLEKRLB E WLGRTL 

1VlCT>WT'01^0T7T^WT?T?riVPFTOP t ?TrPRWO*TRDLK\WGNPGTP*E 

GFEDK\VFYWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ 
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG 

K\FSKSRGVGVFRDM\ AHDTG I PPDI SRFYL\LYIRPEGK\DSA 
FSWTOU^KNNS\ELIJWLGNFlNRA\GMFVSKFFGG\YVPFjMV 
LT PDDQRLLA\ HVT LELQHYHQ \ LLEKVR I RDALRS ILTIS\RH 
GNOYl\OVNEPW\KRIKGSPJU>RQRAGTVTGLAVNlAALLSVML 
QPYMPTVSATIQAQLQLPPPACSII^TNFXCTLPAGEQIGTVSP 
LFQKLENDQ I E SLRQRFGGGQAKTS P KPAVVETVTTAKPQQIQA 
LMDE VTKQGN I VRELXAQKAD KNEVAAEVAKLLDLKKQLAVAEG 
KPPEAPKGKKKK 


5899 


326 


1078 


KCPKSKEPNGVRAPSLPSPLRAAMALSDVDVKKQIKHMMAFIEQ 
EANEKAEEIDAKAEEEFNIEKGRLVQTCRI*KIMEYYEKKEKQIE 

QQIOaLMSTMR^QARLKVIjRARNDLIS^ 

EVYQGI*LDIUjVIjQGIiLRLLEPVM I VR CRP \QDLLLVEAAVQKAI 
PEYMTISQKHVEV\QIDKEA*LAVECSWEWJEVYSGNQRIKVSN 

TLESRLDLSAKQKMPEIRMALFGANTNRKFFI 


5900 


64 


1409 


kaasrdspclefcplcgvsshdlqhrmwyhrlshlhsrlqdllk 
ggviypalpqpnfksllplavhwhhtasksltcawqqhedhfel 
kyanivmrfdyv^pjdhcrsascynskthqrsldtasvdlcikp 
kti rldettlf ftwpdghvtkydlnwlvkns yegqkqkviqpri 
lwnae i yqqaqvps vdcqs fletneglkkflqnflilyg iafven 
vpptqehteklaeri sl i reti ygrmwyftsdfsrgdtaytkla 

LDRHTDTT YFQE PCG I QVFHCLKHEGTGGRTLLVDG FYAAEQVL 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








Q KAP EEFEUjS KS Al \ KHE Y I EDVGECHQ PHDWDWAQS * I S THG 
/ YKELY 1*1 RYNNYDRAVINTVPYDVVHRWYTAHRTLTI ELRRPE 
NEFWVKLKPGIWLFIDNWRVIJlGRECFTGYRQLCGCrirLTRDDV^ 

NTARLLGLQA 


5901 




2121 


VAI EQTSIjKMKQAVGGAPARPTGEY I CNQCGAKYTS LDSFQTHJb 

KTHLDTVL? KLTCPQCNKE FPWQESLLKHVTI HFMITSTY Y I CE 

SCDKQFTSVDDLQKHIADMHTFVFFRCTLCQEVFDSKVSIQIjHIj 

\AVKHSNEKKVYRCTSCNl\T)FRNEIT)IiQLHVKHNI^ 

CI FCGESFGTEVELQCHITTHSKKYNCKFCS KAFHAI ILLEKHIi 

REKHCVFETKTPNCGT^IGASEQVQKEEVELQTLLTNSQBSHNSH 

DGS EEDVDTS EPMYGCDI CGAAYTMETLLQNHQLRDHN I RPGES 

AIVKKKAELIKGNYKCNVCSRTFFSENGLREHMQTHI^PVTO-riTM 

CP I CGERFPSLIiTLTEHKVTHSKSU>TGNCR I CKMPLQSEEEFI* 

EHCQMKPDLRNSLTCF^CWCMQTVTSTLEIiiaHGTFHMQKTGN 

GSAVQTTGRGQHVQKl»YKCASCLKEFRSKQDLVKIiDINGtiPYGL 

CAGCVPTLS KSAS PGINVPPGTNRPGLGQNENLSAIEGKGKVGGI> 

KTRCS * LATFKF* VLKVELPE PHPKPFHRGVS RPDSNSTQLKTP 

QVS PMPRI S P SQSDE KKTYQCI KCQMV FYNEWD I QVH VANHK I D 

EGLNHECKLCSQTFDSPAKLQCHLI EHSFEGMGGTFKCPVCFT V 

FVQANKLQOHI FSAHGQEDK1 YDCTQCPQKFFFQTEI*QNHTMTQ 

HSS 


5902


712 


209 


LKNRRRSRPS I RQS I GSTS VS RWLTS LFT YliDHTAD VQ * V* REF 
IPLXPRQ* ED* MFQSWIiHAWGDTLEEAFEQCAMAMFGYMTDTGT 
VEPLQTVEVETQGDDLQS LLFHFLDEWLYKFSADEFFI P \GWGE 
EFSI*SKHPQGTEVKAITYSAMQVYNEENPEVFVTIDI 


5903 


2106 


735 


DTPGPSLPSTTAPFSLRSLiSFPSRPSYLLPGDPQPIiOGRGDPTT 
PAjLFAIjSAVPGGAASPMPPSGLRLLPIiLIiPLLWIjLVIjTPGRPAA 
GI*STCKTI DMELVKRKRI EA1 RGQ I LSKLRIiAS PPSQG EVPPG P 
LPEAVI,ALYNSTRDRVAGESAEPEPEPEADYYAKEVTRVIJ4VET 
HNE I YDKFKQS THS I YMF FNTS ELRE AVPE P VI*LSRAEIjRL»IjRI» 
KLKVEQHVEL YQKYSNNS WRYLSNRLLAP SDS PEWLS FDVTGW 
RQWLSRGGEIEGFRJ^AHCSa3SRDN'I*LQVDINGFTTGR\RGDL 
ATIHGMNRPFLLIjMATPI*ERAQHIjQS\SRHRQA1j\DTNY\CFSF 
HGGRNCLRC/VHC*HLIFRKDL\GW\KWl\HE\PKGYHANFC\Ii 
GPCPYIWSIiDTQYSKVLAIjYNQ\HKPG\ASAAP \ CCVPQALEP \ 
LPIVYY\VGRK?KVEQLSNMIVRSCKCS 


5904 


3 


1126 


MMEEIENAINTFK^QRLIYEEIilKEEKTTNNELSAISRKIDTW 
ALGNSETEKAFRAISS KVPVDKVTPSTLPEEVLDFEKFLQQTGG 
RQGAITODYDHQNFVKVRNKHKGKPTFMEEVLEHI.PGKTQDEVQQ 
HEKWYQKFLALEERKKES IQIWKTKKQQKREEI FKUCEKADNTP 
VLFHNKQEDNQKQKEEQRKKQKIiAVEAWKKQKS I EMSMKCASQL 
KEEEEKEKKHQKERQRQFKI»KXLLES YTQQKKEQEEFLRIiE KEI 
REKAEKAEKRKNAADE 1 S RFQERDLHKLELKI LDRQAKEDE KSQ 
KQRiU^iO^KEKVENNVSRDPSRLY/NTHQRLGRTNQKDRTNRLW 

ATSTYPT*GYSNLETRNTEKSMR 


5905 


287 


2912 


MAS FP PRUNE KB IVRLRT I GELLAPAAP FDKKCGRENWTVAFAP 
DGS YFAWSQGHRTVKLVPWS QCXQNFLLHGTKNVTNSSS LRLPR 
QNSDGGQKNKPREHI IDCGDIVWSLAFGSSVPEKQSRCVNIEWH 
RFRFGQDQLLLATGI^SGRIKJ:WDVYTGKIJLLNliVDHTGVVRD 
TFAPDGSI>ILVSJ^RDKTIiRVWDLRDDGN\MMKVLRGHQNWVY\ 
SCAFS PDSSMLCS VGAS KAWAAI L V* I»RLC WHHSHT3ATM VLiS 
WAERVAS LATGliGATFT I G * SNLAFVLQG VL YVHRCWSMSTFCF 
SFFLFFFFKVI SPTVKYH * LLSKLI FQFYGIGSLTSETNLM * S I 
WLSNGFSVLPFG II>SDSRDI L»RL* PNLKFVLI FF * K* CIVS VQK 
KKKPKR IALLQBERLS * DKP PSSHL I *Q/TEVNT RI LFRAI LHS * 
I*LIFRI*NCI*TYS*IIDPFYIC^TYDRG*FGKNKMVKF llr FIEM 
* LYYFHKIAFSFaJW*HPCOiPKKFHLAVNILFACS ICFSS * A 
QVGDPS1*L*TSDYLKGRCQWSNNLLTLRFLSVYFFKNLWSGKK 
REGGL* YI/TLFI SVYFS * LVFGINGFQYSFVVKI>HCL.YFMFRX.I 
FKLTFNRNI *NRI CMSAL1NLKTDFNLTMTLSI FFKLUC I YNA* 
YNLN* I * QF+ YKMCMFVI>CMSE*SYNI CI*FIAGF\LWNMDKYTM 
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SEQ 
ID 
NO: 


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A= Ala nine, C=Cysteine, D=Aspartic Acid, E- 
Glutamic Acid, F-Phenyl alanine, G«*Glycine, 
H=Histidine, I=Isoleucine, K=* Lysine, 
L=Leucine, MssMethionine, N=Asparagine, 
P=Proline, Q=Glutaraine, R=Arginine, 
S=Serine, T=Threonine , V=Valine, 
W= Tryptophan , Y=Tyrosine, X= Unknown, +=Stop 
Codon, /=possible nucleotide deletion, 
\=spossible nucleotide insertion) 
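The symbol key in the column heading above can be applied mechanically to any amino acid segment in this table. The following is a minimal sketch only, not part of the original disclosure: the function name `summarize_segment`, the dictionary names, and the toy input string are hypothetical and chosen purely for illustration. It separates residue letters from the three annotation symbols (`*`, `/`, `\`) and sets aside any character the key does not define, which can help flag positions damaged during text extraction.

```python
# Illustrative sketch; all names here are hypothetical, not from the source.
# One-letter residue codes exactly as listed in the table heading.
RESIDUES = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}
# Annotation symbols as listed in the table heading.
MARKERS = {"*": "stop", "/": "deletion", "\\": "insertion"}


def summarize_segment(segment):
    """Split a segment into residues, annotation counts, and undefined characters."""
    residues = []
    counts = {"stop": 0, "deletion": 0, "insertion": 0}
    unrecognized = []
    for ch in segment:
        if ch.isspace():
            continue  # spacing carries no meaning in the table
        if ch in RESIDUES:
            residues.append(ch)
        elif ch in MARKERS:
            counts[MARKERS[ch]] += 1
        else:
            unrecognized.append(ch)  # not defined by the key
    return {"residues": "".join(residues), **counts, "unrecognized": unrecognized}


if __name__ == "__main__":
    # Toy input invented for this example; it is not a sequence from the table.
    print(summarize_segment("MAS\\GA*K/L"))
```

Characters reported under `unrecognized` are simply those absent from the key; the sketch makes no attempt to guess what residue they originally represented.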








IRKLEGKHHDWACDFS PDGALLATAS YDTRVYI WDPHNGDILM 
BFXSHIJPPPTPIFAGGANDRWVRSVSFSHDGIjHVASIADDKMVR 
FWR I DEDYP VQVAPLSNGLCCAFSTDGS VliAAGTHDGS VYFWAT 
PRQVPSLQKLCRMS IRRVMPTQEVQELP I PSKLLEFLSYRI 


5906 


146 


2038 


REGAGSGRMASGA\ YNP YIEI I EQPRQRGMRFRYKCEGRSAGSI 
PGEHSTDNNRTYPSIQIMNYYGKGKV\RITLVTK\NDPYKPHPH
DLVGKDCRD\GYYEAEFGQE\RJiP\ljFFQN\ljGIRC\nCKKEVKE 
A\ I ITR\ I KAGINPFDVP*KQLNDIEDCDLDVVRXWFRVFI*PDG 
HGNIA TTALPP V\ VSS P I YDNRAFNTAEIiR VCR VNKNCGS VRGG 
DE I FI*I*CDKVQKDDI EVR FVLN DWEAKG I FSQADVHRQVAIVFK 
TPPYCKAITEPVTV^QLRRPSDQEVSESMDFRYIiPDEKDTYGN 
KAKKQKTTIjIiFQ KLCQDHVETG FRH VDQDG LELLTSGDP PTLAS 
QSAG ITVNFPERPRPGI*LGS IGEGRYFKKE PNLFSHDAWREMP 
TGVSSQAESYYPSPGPISSGLSHHASMAPLPSSSWSSVAHPTPR 
SGNTNPLSS FSTRTLPSNSQG I PPFIiRXPVGNDLNASNACI YNN 
ADDrVGMEASSMPSADLYGISDPNMLSNCSVNMMTTSSDSMGET 
DNPRliIiSMKLENPS CNS VIiDPRDLRQLHQMS SS S MSAGANSNTT 
VFVSQS DAFEGSDFSCADNSM I NESG PSNS TNPNSH VFVQDSQ Y 
SGIGSMQNEQI^SDSFPYEFFQV 


5907 


99 


1873 


TyiJjSSWSS**Nl,DTKIKSQVKV/RKGHKKISWPYPQPAKQNGK 

KATS KV PSAPHFVHPNDHANREAEljKKKWVREMREKQQAAREQE 

RQKRRTIESYCQDVLRRQEEFEHKEBVLQELNMFPQIiDDEATRK 

AYYKEFHKVVEYSDVILEVIJDARDPL^CRCPQMEEAVIiRAQGNK 

KLVLVLNKIDLVPKEVVEKWIjDYIiRNELPTVAFI^ 

NRCS VPVDQASESLLKS KACFGAENLMRVIiGNYCPIiGBVRTHIR 

VGWGLPNVGKSSIiINSLKRSRACSVGAVPGI TKFMQEVYLDKF 

IRI>LDAPGrVPGPNSEVGTILRNCVHVQKLADPVTPVETILQRC 

NLEE I SNYYGVSG FQTTEH FLTAVAHRIiGKKKKGGIj YSQEQAAK 

AVLADWVSGKI SFYTPPPATHTLPTHLSAElVKEMTEVFDIEDT 

EC^NElXr^CIATGESDElJjGEm^ ! 

ENKTTVYKIGDLTt5YCrrNPNRHQMGV7AKRNVDHRPKSNSMVDVC 

S VDRRSVLQR I METDPLQX^QAliAS ALKNKKKMQKRADKI AS KL 

SDSMMSAIiDLSGNADDGVGD 


5908 


247 


975 


HCGIKKRGEGSGSPSPASGGFQDGCQIP3PSLPSEEETHPHTRA 
HTRTLRAT1iTRRPPRSHSTRI*RFPMPI*IX3DGGIjASWK/PMRER* 
GWRR P AKAAG AS LG VAATG KRG CRM S KRYLQKATKGKLLI 1 1 FI 
VTLWGKWSSANHHKAHHVKTGTCEVVAIjHRCCNKNKI eepjsqt 
VKCS CFPGQVAGTTRAAPSCVDASIVEQKWWCHMQPCLEGEECK 

vlpd'rkgwscssgnkvkttrvth 


5909 


1 


1 5002 


PAIPGSTIIWAPGSHSAARADGRHGSLPSQSQAPGALCGARAPP 
S S NLRADRS M I CAQ ARAG KNLYKNRFLGLAAMAF P S RNS QS liRR 
CKEP IRYS YNPDQFHNMDLRGGPHDGVTIPRSTSDTDLVTS DSR 
STLMGRSSYYSIGHSQDLVIHWDIKEEVDAGDWIGMY1»IDEV1»S 
ENFLDYKNRGVNGSHRGQI I WKI DAS SYFVE PETRI CFKYYHGV 
SGALRATTPSVTVKNSAAPIFKS I GADETVQGQGS RRI* I S FSLS 
DFQ AMGLKKGMFFNPDP YLKI S I QPGKHS I FPALPHHGQERRSK 
I IGNTVNP I WQABQFSFVSLPTDVIiE I EVKDKFAKSRP 1 1 KRFL 
GKLSMP VQRI*I*ERHAI GDR WS YTLGRRljPTDHVSGQbQFRFE I 
TSSIHPDDEEISLSTEPESAQIQDSPMNNLMESGSGEPRSEAPE 
SSESWKPEQLGEGSVPDRPGNQS I ELSRPAEEAAVITEAGDOGM 
VSWSPEGAGELtAQVQKDIQPAPSAEEIAEQLDLiGEEASAIJbLE 
DGEAPASTKEEPLEEEATTQSRAGREEEEKEQEEEGDVSTLEQG 
EGRIiQU^VKRKSRPCSLPVSELETVIASACGDPETPRTHYIR 

IHTLLHS M PS AQGGSAAEEEDGAE EESTLKDS S EKDGLS EVDTV 
AADPSAIiEEDREEPEGATPGTAHPGHSGGHFPSLANGAAQDGDT 
HPSTGSESDSSPRQGGDHSCEGCDASCCSPSCYSSSCYSTSCYS 
SSCYSASCYSPSC^GNRFASHTRFSSVDSAKISESTVFSSQDD 
EEEENSAFESVPDSMQSPELDPESTNGAGPWQDEIAAPSGHVER 
S PEGLES PVAGPSNRREGECP ILHNSQPVSQLPSIjRPEHHHYPT 
IDEPI^PIWEARIDSHGRVFYVDHVNRTTTWQRPTAAATPDGMR 
RSGSIQQMEQLNRRYQNIQRTIATERSEEDSGSQSCEQAPAGGG 








GGGGSDSEAESSQSSLDLRREGSLS PVNSQKTTLLLQS PAVKF1 
TN P E FFTVLHANYSAYRVFTSSTCLKHMI LKVRRDARNFER YQH 
NRDLVNFINMFADTRLELPRGWEI KTDQQGKSFFVDHNSRATTF 
IDPRI PLQNGRLPNHLTHRQHLQRLRS YSAGBASEVSRNRGASL 
LARPGHS LVAAI RSQHQHESLPLAYNDKI VAFLRQPNI FEMLQE 
RQPSLARNHTLREKIHYIRTSGNHGLEKI^CDADLVILLSLFEE 
EIMSYVPLQAAFHPGYS FS PRCS P CSS PQNS PGLQRASARAPS P 
YRRDFEAKLRNFYRKLEAKGFGQGPGKIKL1 IRRDHLLEGTFNQ 
VMAYSRKELQRJJKLYVTFVGEEGLDYSGPSREFFFLLSQEIjFNP 

YYGLFEYSANDTYTVQISPMSAFVENHLEWFRFSGRILGXLALI

HQYLLDAFFT\RPFYKAI*L\RI j PC\D\LSDI,EYIJ)EEFHQSLQW 

MKDNNITDILDLTFTVNEEVFGQVTERELKSGGANTQVTEKNKK

EYIERKV'KWRVEKGWGQTFJ^IjVRGFYEVVDSRIjVSVF'DARELE 
L V I AGTAE I DLNDWRNOTTE YRG3 YHDGHLVIRWFWAAVERFNNE 
QRLRLLQFVTGTSSVPYEGFAAPPWE PMGLRRFIiP * KKWGKITS 
LPPRG\HTCLQPDWDLPTVSPRTPMI>YEK\LLTA\VEETSTFGT 


5910 


1526 



446 


VAEFAAMEPGRTQI KLDPRYTADLXjEVLiKTNYG I PSACFSQPPT 
AAQLLRALGPVELALTSILTXiLAIjGS iai fledavylykntlcp 
I KRRTLLWKSSAPTWSVLCCFGIiWI PRS Ij VXiVE mt its fyavc 
FYI>IiMI>VMVEGFGGKEAVLRTI>ROTPMMVHTGPCCCCCPCCPRL 
LLTRKKLQ \R+CWALSNTPS * R* R* PWWACFSS PTASMTQQTFL 
RGAQLYGSTLSSA/CSITjLALWTLGI isrqarlhlgeqnmgakf 
ALFQVLL I L.TALQ PS I FSVLANGGQ IACS PP YS S KTR3QVMNCH 
LLILETFI^TVLTRJ/IYYRRKDlIKVGYETFSSPDI^LmjKALRWM 
AWTMKGCCTH 


5911 


109 


595 


QIjPLAPC IQGKGLEMRS PKPQS FI IRS SHSGAGLLVKNPSTPVF 
CGHRRGGAAFKYKPTPWGPEQRPTGQKHMRGGVSLLSPRLECS 
GTISAHCNLRLPSSSNS PAPAS * LAG I TGVCHHAQL I FVFLVET 
GFHHVGQAGLELL/NWIHLPRPPECVLGLQA 


5912 


924 


277 


M I LNKAIJ41iGALALTT VMS PCGGED I VADHVAS YGVNL YQS YG P 
SGQYSHE FDGDEE FYVDLER KETVWQLP L FRRFRRFD P QFALTN 
IAVLKHNIoNIVIKRSNSTAATNEVPEVTVFSKSPVTLGQPNTLI 
CLVDNI FPPWNITWLSNGHSVTEGVSETRPSSPKSDHFI JjQDQ 
VTSPSFPFE* *DL* TAKVEQLGAWFEPLLKHWGAE IPTTL 


5913 


46 


1198 


QljRMAGAEGAAGRQSEI^PVVSLVDVLEEDEELENEACAVLGGS 
DSEKCS YS QGS VKRQALYACS TCTPEGEE PAGI CLACS YECHGS 
H KLFEL YTKRN FRCDCGNS KFKNLECKLL PDKAKVNSGNKYNDN 
FFGLYCICKRPYPDPEDEIPDEMIQCWCEDWFHGRHLGAIPPE 
SGDFQEKVCQACMKRCSFLWAYAAQIAVTKIST\GMMDWCGTI»M 
E* /DDQEVIKPENGEHQDSTLXEDVPEQGKDDVREVKVEQNSEP 
CAGS S SESDLOTVFKCTCS LNAESKSGCKLQELKAKQLI KKDTAT 
YWPLNWRS KLCTCQDCMKMYGDLD VL FLTDEYDTVLAYENKGKI 
AQATDRSDPLMDTLSSMNRVQOVELIC/G IQ* FED 


5914 


960 


124 



NLGGS ELPPEEALFIQVASMNQRRVDFYLAS IEDMLVAI / GGRN 
E1IGALSSVETYSPKTDSWSYVAGLPRFTYGHAGTIYKDFVYISG 
GHDYQIGPYRKNLLCYDHRTDVVTBERRPMTTARGWHSMCSIiGDS 
I YS IGGSDDN IESMERFDVLGVEAYS PQCNQWTRVAPLLKANSE 
SGVAVWEGRIYILGGYSWEOTAFSKTVQVYDREADKWSRGVDLP 
KAIAGGSACFIAP*SLGQRTRKRKAKARGTRTGASDPSCASWDH 
PHRHTj PGXCRPAATS


5915 


1604 


703 


FPGRPTRPLKLGRRRKRARI IQAPHCIISFRPRTCPPGALQAPEA 
PASRAEGPVAVWNGHTEGPAPARSAPKEPPGLPRPLGSFPCPT 
PQEDFPALGGPCPPRMPPS PGFSAWLLKGTPPPPPPGLVPP IS 
KPPPGFSGLLPSPHP\PVSPAPPPPPPQK/RPRLLPAP/PGLPS 
PRELPGEE PS AHPVHQGLPAERRG PLQRVQBPLRGVQTGPDLRS 
PVLQELPGPAGGEFPEGL* * AAGPAAH 


5916


256 


633 


SPRMWEI WGP WHRWESFSLEGEWPSRI PEPS PDSTKGTSGKGCR 
TVTGAVHRHLNHVAGIIPWVLHSQLKPTAATAQDQWTSQQYPDH 
PTRLI LQ * NQATAD KNN * TTALLQPHQRL \ VS PRMAEA 


5917 


1343 


827 


AHQILTYLEP/ ICLWNYNKILTVFLTKSVLEI *KFIHTPQTYR 








?*NDFFGIKEVYVSRRLRKTS F/RIAVTFLEQAWSKECVPVDQ 
?MEHLL PS LLS LAS D PVP1WR VLXjAKALRQMLLB KAY FRNAGNP 
HLEVIEETILALQSDRDQDVS FFAALEPKRRNI IDTAVLEKQN 


5918


13 


1247 


EGAQVARRRSRRQWRAGRCGRGRGGRRAERTGGRGPPGRPR PLP 
PGPARRGRRRMBTP FYGDEAIjSGLGGGASGSGGTFAS PGRLFPG 
APPTAAAGS MMKKDAIjTLSLS EQVAAALKPAPAPAS YPPA\ ADG 
APSAAPPDGLLASPDLGLLKLASPKT.RRLI iqsnglvtttptss 
QFLYPKVAASEEQEFAEGFVKALEDLKKQNQU3AGRAAAAAAAA 
AGGPSGTATGS AP PGEIiAPAAAAPEAP VYA\NI»S S Y \ AGGCRG1* 
RGGAAT\VAFAAEPVPFPPPPPPGALGPRRP/RIAIiQGRRPQTV 
PDVP\SFGESP\PLSPIET\DTPRRI\KAKRKRD\RNPQIRAPK 
PAS RKLG AQ S RALE R ES EDPS * S PEHGS LASTASLLREQVAQLK 
QKVLSHVNSGCQLLPQHQVPAY 


5919 


1 


4254 


TS VOGDSOGTPTSSQGS INMEHW I SQAI HGSTTSTTSSSSTQSG 
GS GAAHRLAD VMAQTH I ENHSAPPDVTTYTSEHSIQVERPQGST 
GSRTAPKYGNAELMETGDGVPVSSRVSAKIQQLVNTLKRPKRPP 
LREFFVDDFEELLEVC^PDPNQPKPEGAQMLAMRGEQLGVVTNW 
PPSIiEAALQRWGTISPKAPCLTIWDTNGKPIjYILTYGKLWTRSM 
KVA Y S I LHKLG TKQE P MVRPGDR VAL V F P NND P AAFMAAFYG CL 
LAE WPVP I EVPLTJRKDAGSQQIG FLlJGS CGVTVALTS DACHKG 
LPKSPTGE I PQFKGWPKLLWFVTESKHLSKPPRDWF\ PHI KDAN 
NDTAYIEYKTCK\DGSV1X3VTVTRTALLTHCQALTQACGYTEAE 
TI VNVLDFKKDVGLWHGILTSVMNMMHVIS I P YSLMKVNPIiS WI 
QKVCQYKAKVAC^KSRBMHWALVAHRDQRDINLSSLRMLIVATC 
ANPW S ISSCDAFLNVFQS KGLRQEVI CPCAS S PEALTVAI RRPT 
DDSNQPPGRG VLSMHGLTYGV I RVDS EEKLS VLT VQDVGLVM PG 
AIMCSVKPIX3VPQLOITOEIGELCVCAVATGTSYYGLSGMTKNT 
FE VFAMTSSG API SE YPF I RTGLLG FVG PGGLVFVVGKMDGLMV 
VSGRRHNADD I VATAIA VBPMKF VYRGRIAVFS VTVLHD ERIVT 
VAEQRPDSTEEDS FQWMSRVLQAIDS IHQVGVYCLALVPANTLP 
KTPLGG IHLS ETKQLFLEG S LHPCNVIjMCPHTCVTNLP KPRQKQ 
PE IGPASVMVGNLVSGKR I AQASGRDLGQ I EDNDQARKFLFIiSE 
VLQWRAQTTPDHI L YTLLN CRGA I ANS LT CV QLHKRAE K I AVML 
MBRGHLQDGDHVALVrTPGIDLIAAFYGCLYAGCVPITVRPPHP 
CNIATTLPTVKMIVEVSRSAa^TTQLICKLLRSREAAAAVDVR 
TWPLILDTDD* PKKRPAQ I CKPCNPDTLAYLDFSVSTTGMLAGV 
KMSHAATSAFCRSIKLQCELYPSREVAICLDPYCGI/5FVLWCLC 
SVYSGHQSILI P P SEL.E TNP ALW LLAVS Q YKVRDTFCS Y S VMEL 
CTKGLGSQTESLKARGLDI,SRVRTC7VVAEERPRIALTQSFSKLi 
FTOLGLHPRAVSTSFGCRVNIiAICLQGTSGPDPTTVYVDMRAIjR 
HDRVRLVERGS PHS LPIjME SGKIL PGVR 1 1 IANP ETKGPLGDSH 
LGE I WVHS AHNAS GYFTI YGDESLQSDHFNSRLSFGDTQTI WAR 
TGYI^FLRRTELTDANGBRHDAIiYWGALDEAMELRGMRYHPID 
IETSVIRAHKSVTECAVF^WTNLLVVVV2U^SEQEALDLVPLV 
TNVVT^EHYLI VGVVVVVD IGVI P INSRGEKQRMHLRDGFIaADQ 
LDPIYVAYNM 


5920 


1381 


1499 


QLGAVAHAGVSRI PP* LFPPliHPTFLSLWCLHHKLP/HPPGASM 
VRPPWPRRPPAHISSVRQASTQVPRTVPHTQRVANIGTQTTGP 
SGVGCCTPGRPLLPCKCS3AAHSTYRVQEPAVH1PGQEPLTASM 
IiAAAPLHEQKQMIGER^YPLIHDVHTXJIiAGKITGMLIiEIDNSEL 
LLMLSS PESLHAKI DEAVAVLQ AHQAME QPKAYMH 


5921 


727 


157 


VCPGTGGE*GLWGQIX5GX>PKETPLKPMDAFTGSGLKRKFDDVDV 
GSSVSNSDDEISS SDSADSCDSLNPPTTAS FTPTS ILKRQKQLR 
RKNVRFDQ VTVYYFARRQG FTS VPSCGGS SLGMAQRHNS VRS YX 
IX^FAQEQEVNHREILREHLKEEKLHAKKMKLTKNGTVESVEAD 
GLTliDDVSDEDIDVENVEVDDYFFI^PLPTKRRRALLRASGVHR 
I DAEEKQ ELRAI RLS REECG CD CR LY CD PEACACS Q AG I KCQVD 
RMSFPCGCSRDGCGNMAGR I EFNPrRVRTHYLHTIMlCLELESKR 
Q\GAAQQPQ\ *GALPDCQLQPDRSTGL+ DPS WIGSKGLS FTGKG 
AAATHLI I LRVT ENRGAEGKRK 


5922 


2475 


49S 


S YSNWGLFPS VFIQVPRSRTGNLKPIFLFYS YYE \ CMETLKG \T 


CIiYNATQY KVCSPRNDR PDACYN PSE PAATTVFEIRTGLLLGDT 
SKI I TRTEEKE I PKQI TLRFDACAAINSKKLEIGCGSLN * ERS + 
RVENKYVCHESGVCKNCAYWPCVI * AT * KKNKNDSVYLQKGEAN 
PSCAAGHCNPLRLI I TN PLDPH WKKG ER VTLG INRTG LKPQWI 
LI KGEVHKCS PXPVFQTPYEELNLPAPEI^KKTlQ^FIiQIAENV 
I FLLNGTS C YVRGGTT IGDRWPWEA* ELVPTD PAPD 1 1 P I * KAE 
ASNF* VLKTS 1 1 RQYC IAREGKDF 1 1 P VGKPNC I GQKL YNSTTK 
TIT* * DIiNHTEKNPFSKFSKIiKTA*AHAESH* DWTVPSGLY* IC 
RHRAYFRXiPNKWADSCVIGTI KPSFFLLP I KMGELLGFS VYASR 
EKKGIVIGNWKDNEWPRERIIQYYGPATWAQDGSWGYR/TP/VY 
MLNWI I RLQAILEI I SNETGRALTVLAWQBTQMRNAI YQNRIAL 
DYLLVAEGGVCRKJ^LTNCCIXJINBQGQVVKNIVRDMTKIJ^ 
IQVWHKFDPESLFGKWFPAIGGFKTLIVGVLLVIRTClxLLPCVL 
PLIiFQMIKG IVATLVHQKTSAHVKYMNHYRSISQRDSKSEDESE 
NSH 


5923 


137 


638 



QLCGRRGQRFRTS IKRMHPI * RTCPNTNL/ 1 ILLSQENTQIRDL 
QQENRELWI SLEEHQnALEI»I MSKYRKQMLQLMVAKKAVDAE PV 
LKAHQSHS AE I ESQ I DR I CEMGEVMRKAVQ VDDDQ FCK I QEKLA 
QLELENKELRELLS ISSESLQARKENSMDTASQAI K 


5924 


274 



2146 


EKGKVKDAGAEQWISLSI*SCKGSWETQ?SNHI*NSI*TPPTSVRRM 
PLITTVTI^KMVARHHKKIiLCSKAFSTQDCX3KIFLHSQMGIHH0 
SVCMKLKPNTSHIISILMGQPKALVQnETLAPLTI I IQKFQTQD 
HMKFWKNLPLHSHHLTPSVPQTVIPKKTGSPE I KLKITKT I QNG 
RELFESSLCX3DLLNEVQASE \Q* NQS IESRXEKRKKSNKKDSSR 
SEERKSHKIPKLEPEEQNRPNERVDTVSEKPREEPVIjKEGSPSS 
ANTIFCSNNGSVHW\FKFQVGDLVWSKVGTYPWWPCMVSSDPQL 
EVHTKINTRGAREYHVQFFSNQPERAWVHEKRVREYKGHKQYEE 
LLAEATKQASNHSEKQKIRKPRPQRERAQWDIGIAHAEKALKMT 

REERI EQYTFI YIDKQPEEALSQAKKS VASKTEVKXTRRPRS VI* 
NTQPEQTNAGEVASS LSSTE IRRHSQRRHTSAEEEEPPPVKI AW 

KTAAARiCSJjPAS ITMHKGSIiDLQKCNMS PWKI BQVFALQNATG 
DGKFI DQFVYSTKG I GNKTE I SVRGQDRL I ISTPNQRNEKPTQS 
VSSPEATSGSTGSVEKKQQRRS IRTRSESEKSTEWPKKKI KKE 
QVGFLHVES 


5925 


216 


1911 



MMTAESREATGLSPQAAQEKDGIVIVKVEEEDEEDHMWGQDSTL 
QDTPPPDPE IFRQRFRRFCYQNTFGPREALSRLKELCHOWLRPE 
INTKEQILEDLVLEQFLSIIiPKEIiQVWLQEYRPDSGEEAVTLLE 
DliE LDLSGQQVPGQVHG PEMLARGMV PLD P VQESS S FDLHHEAT 
QSHFKHSSRKPRIMSRAI*PAAHI PAPPHEGSPRDQAMASALFT 
ADSQAMVKI3DMAVSLI LEEWGCQNIARRNLSRDNRQENYGSAF
PQGGEINRNENEESTSKAETSEDSASRGETT<JRSQKEFGEKRI)QE 
GKTGERQQKNPEEKTRKEKRDSGPAIGKDKXTITGBRGPREKGK 
GLGRS FS LS SNFTTP E EVPTGTKSHRCDE CGKCFTRS S S I*IRHK 
I IHTGEKP YECSECGKAF\SLNS \NLVLHQRI\HTGEKPHECNE 
CGKAFSHSSNIilliHQRIHSGEXP YECNECGKAFSQS SD\ LTKHQ 
R IHTGEKP YECSECGKAFNRNS YI*I LHRR VHTREKP YKCTKCG K 
\AFTRSSTLTLHHRI HARERASEYS PAS LDAFGAFLKS CV 


5926 


2 


233 


DRCLMLKQGSQPGSPPAT/CEPPAPPVYQAPCQSCPEPPGAIIEP 
SDSPHHTPVHPPPEHSAACPAPATCCPPPRSSMS 


5927 


4146 


1248 


KHFS KFGSQAIiYQI#KRPASGQNS I SVMPAQKI TKPAAKYGI PLA 
Y KKYGDKKLHE KKPLQ KHKQAH QT PEKR VNTGE ERR K I S E E AAR 
KRRLEFIEKEKKQKDQI I SLMKAEQMKRQEKERLERINRAREQG 
WRNVLSAGGSGEVKAPFLGSGGT IAPS S FS S RGQYEHYHAI FDQ 
MQQQRAEDNEAKWKRE I YGRGLPERQKGQIAVERAKQVEEFTiQR 
KREAMQNKARAEGHMG I LQNIAAMYGGRPSSSRGGKPRNKEEEV 
YLARIjRQI RLQNFNERQQI KAKLRGEKKEANHSEGQEGSEEADM 
RRKK\ IESLKAHANARAAVLKEX)IiERKRKEAYEREiCKVWEEHIiV 

AKGVKSSD VS P PIiGQHETGGSPS KQQMRS VISVTSALKEVG VBS 
SLTDTRETSEEMQKTNNAISSKKEIIJUUiNENlJCAQEDEKGKQN 
LSDTFEINVHEDAKEHBKEKSVSSDRKKWFJ^GQLVIPLDELTL 
IXTSFSTTFJUm^GEVIKI/SPNGSPRP^WGKSPTDSVLKII/SEAE 








LQLO;i'tiiJ*ENTTXRSEISPEGEKYKPLITGEKKVQCISHEINPS 
AI VDSPVETKS PEFSEAS PQMSLKLEGNLEEPDDLETE ILQEPS 
GTNKDE\SLPCTITDVWISEEKETKBTQSADRITIQENEVSEDG
VS S TVDQLSD IHIEPGTNDSQHS KCDVDKS VQPEP FFHKWHS E 
HLNLVPQVQS VQCS PEES FAFRSHSHLPP KNKNKNSLLIGLSTG 
LFDANNP KMLRTCSLPDLSKLFRTLMDVPTVGDVRQDNLE I DE I 
EDENIKEGPSDSEDIVFEETDTDLQELQASMEQLLREQPGEEYS
EEEES VLKNS DVE PTANGTDVADEDDNP S S ESALNEE WHS DNS D 
GEIASECECDSVFNHLEELRLKLEQEMGFEKFFEVYEKIKAIHE
DEDENI E I CS KI VCNI LGNEHQHLYAKI LHLVMADG AYQEDNDE


5928 


4146 


1248 


KHFSKFGSQALYQLKRPASGQNS 1 S VMPAQKI TKPAAKYG I PLA 
YKKYGDKKLHEKKPLQKHKQAHQTPEKRVNTGEERRKISEEAAR 
KRRLEFIFJ<EKKQKDQIISLMKAEQMKRQEKERLERINRAREQG 
WRNV1*SAGGSGEVKAPFLGSGGTIAPSSFSSRGQYBHYHAJFDQ 
MQQQRAEDNEAKWKRE I YGRGLPE RQKGQLAVERAKQVE EFLQR 
KREAMQNKARAEGHMGILQNLAAMYGGRPSSSRGGKPRNKEEEV 
YI^LRQIRIjQNFl^RQQIKAKLRGEKKEANHSEGQEGSEEADM 
RRKK\IESLKAHANARAAVLKEQLERKRKEAYEREKKVWEEHLV 
AKGVKS SDVS PPLGQHETGGS PSKQQMRSVT S VTS ALKE VG VDS
SLTDTRSTSEEMQKTNNAISSKREILRRLNENLKAQEDEKGKQN 
LSDTFE INVHEDAKEHEKEKSVSSDRKKWEAGGQLVI PLDELTL
DTS FSTTERHTVGEVIKLGPNGS PRRAWGKS PTDS VLKI LG EAE 
LQLQTELLENTTIRSE I SPEGEKYKPLITGEKKVQCISHE INPS
AIVDS P VETKS P E FSEAS PQMSTjKLEGNLEEPDDLETE ILQE PS
GTNKDE\SLPCTITDVWISEEKF/TKETQSADRITIQENEVSEDG
VSSTVDQLSDIHIEPGTNDSQHSKCDVDKSVQPEPFFHKWIISE
HLNLVPQVQSVQCSPEESFAFRSHSHLPPKNKNKNSLLIGLSTG 
LFDANNP KMLRTCS LPDLS KLFRTLMDVPTVGDVRQDNLE I DE I 
EDENI KEG PS DS ED I VFEETDTDLQE LQASMEQLLREQPGEE YS
EEEESVLKNSDVEPTANGTDVADEDDNPSSESALNEEWHSDNSD 
GEIASECECDSVFNHLEELRLHLEQEMGFEKFFEVYEICIKAIHE
DSDENIE I CS KI VQNI IX3NEHQHLYAKI LHLVMADGA YQEDNDE


5929 


3 


1558 


LDFSMTTQLPAYVA I LLF YVS RASCQDTFTAAVYEHAAI LP NAT
ill VvfaKiilvAJjALr^RNXuJlLEGAITSAADQGAHIIVTPE f 
WNFNRDSLYPYLEDI PDPEVNW I PCNNRNRFGQTPVQ2RLS CL\ 
AKNNS I YWANI GDKKPCDTSD PQCP PDGR YQYNTD WF\DS QG j 
KLVARYHKQNLFMGENQFNVPKEPEIVTFNTTFGS FGI FTCFDi 
LFHDPAVTLVKD FHVDTI VFPTAWMNVLPHLSAVE FHS AWAMGM
RVNFIASNIHYPSKKMTGSGI YAPNSSRAFHYDMKTEEGKLLLS 
QI^SHPSHSAVVNWTSYASSIEALSSGNKEFKGTVFFDEFTFVK
LTG VAGNYTVOQ KDLCCHLS YKMS EN I PNEVYAIiGAFDGLHTVE 
GR YYLQI CTLLKCKTTNLNTCGDSAETASTRFEMFSLSGTFGTQ 
YVFPEVLLSENQLAPGEFQVSTDGRLFSLKPTSGPVLTVTLFGR 
LYE KDWASNASSG LTAQAR 1 1 ML I VI AP I VCS LS W


5930 


113 


6082 



RGNCFW I V P FTMAQRTGLEDPER YLFVDRAV I YNPATQAD WTAK 
KLVWI PS ERHGFEAAS I KEERGDE VMVE LAENGKKAMVNKDD I Q 
KMNPPKFSKVEDiMAELTCLNEASVLHNLKDRYYSGLIYTYSGLF 
CVVINPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYRCM 
LQDREDQS ILCTGESGAGKTENTKKVIQYLAHVASSHKGRKDHN 
I PGE\LERQLLQANPI LESFGNARTVQNDNSSRFGKF IRINFDV 
TGYIVGANIETYLLEKSRAVROAKDERTFHI FYOLI*SG\ AGEHL 
KSDLLLEGFNNYRFLSNGYIPIPGQ\QDKGNFRGDPGEA1>1HIKG 
FSHEEILSMLKWSSVUJFGNISFKKERNTDC^MPEOTVAQKL 
CHLLGMNVME FTRA I LTPRIKVGRD YVQ KAQTKEQAD FAVEALA 
KATYERLFRWLVHR I NKALDRTKRQGAS FIG ILDIAGFE I FELN 
SFEQLCINYTNEKLQQLFNHTMFILEQEEYQREGIEWNFIDFGL 
DLQPCI DL I ERJ>ANP PGATIALIJ3EECWFPKATDKTFVE KLVQEQ 
GSHSKFQKPRQLKDKADFCIlOTAGKVDYKADEWl^KNMDPl^^ 
NVATLLHQS SDRFVAELW KDVDR I VGLDQ VTGMTETAFG SAY FCT 
KKGMFRTVGQLYKESLTKLMATLRNTNPN FVRCI I PNHEKRAGK
LDPHLVLDQLRCNGVLEGIRICRQGFPNRIVFQEFRQRYEILTP








NAI PKGFMIX3KQACBRMIRALEl J DPJ^YRIGQSKIFFRAGVIxAH 
LKKKRDLK I TDI 1 1 FFQAVCRGYlxARKAFAKXQQQI>SAL,KVLQR 
NCAAYI^RHWQWWRVFTKVK^LLQVTRQ 

EKO^CVEGEljEEMERKHQQIJ^EEKNII^QL<3AL"rKLFAEAEEM 

RAR1JVAKKQELEEILHDLSSRVEEEEERNQILQNEKKKMQAHIQ 

DI£EQUDBEEGAR£KWI£KVTAEAKZKKM£E^ 

I KEKKLMEDRIAECSSQLAEEEEKAKNLAK1RNKQBVM I SDLEE 

RIiKKEEKTRQELEKAKRKLDGBTTDLQDQIAEl^AQIDEIiKL^ 

AKKEEELQGAIJUIGDDETIjHKNNALKVVREI^AQIAELQEDFES 

EKASRNKAEKQKRDLS EELEAL KTELEDTLDTTAAQQELRTKRE 

QEVAELKKALEEETKNHEAQIQDMRQRHATALEEIiSEQLEQAKR 

FKANLEKNKQ^LETDNKELACEVKVLQQVK^ 

QEIOlAKVSEGDRliRVEIiAEKASKI^NEIiDNVSTLI^EEAEKKGI K 

FAKDAASI^SQLQDTQELLQEETRQKIjNLSSRIRQLEEEKNSI-iQ 

EG^EEEEEARKNIiEKQVLAIiQSQLADTKKKVDDDLGTIESLEEA 

KKKLLKDAEALSQRLEEKAIAYDKLEXTKNRI^ 

HQRQVASNLEKKQ \ KKFDQLLAEEKS I SARYAEERDRAEAEARE 

KETKALS LARALEEALEAXE EFERQNKQLRADMEDIiMSS KDDVG 

KNVHELEKS KRALEQQ V\ E EMRTQLE ELED ELQ ATEDAKLRLEV 

NMQAMKAQFERDU2TRDEQNEEKKRLLI KQVRELEAELEDERKQ 

RALAVAS KKKME I DLKDLEAQIEAANKARDEVI KQLRKLQAQMK 

DYQRE1*EEARASRDEIFAQSKESEKKLKSLEAEILQLQEEIiASS 

ERARRKAEQERDEI^EITOSASGKSALLDEKRRLEARIAQI*EE 

ELEE EQSNMELLNDRFTUCTTI^VDTIjNAELAAERS aaqksdnar 

QQLERQNKELKAKI^ELEGAVKSKFKATISALEAKIGQI*EEQL»E 

QSAKERAAANKiVRRTEICKLKEIFMQVEDERRHADQYKEQMEKA 

NARMKQLKRQLEEAEEEATRANASRRKLQRELDDATEANEGLSR 

BVSTLKNRIjRRGGPISFSSSRSGRRQIjHI»EGASLEL»SDDDTESK 

TS DVNETQP PQS E 


5931 


113 


6082 



RGNCFWIVPFTMAQRTGLEDPERYLFVDRAVIYNPATQADWTAK
KLVWIPSERHGFEAASIKEERGDEVMVELAENGKKAMVNKDDIQ
KMNPPKFSKVEDMAELTCLNEASVLKNLKDRYYSGLIYTYSGLF
CWTNPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYRCM
LQDREDQSILCTGESGAGKTENTKKVIQYLAHVASSHKGRKDHN
IPGE\LERQLLQANPILESFGNARTVQNDNSSRFGKFIRHTFDV
TGYIVGANIETYLLEKSRAVRQAKDERTFHIFYQLI/SGVAGEHL

KSDLLLEGFNNYRFLSNGYIP I PGGAQDKGNFRGDPGEAMHIMG 
FSHEEILSMLKWSSVLQFGNIS FKKERNTDQASMPENTVAQKL 
CHLLGMNVMEFTRAI LTPR I KVGRD YVQKAQTKEQADFAVEAIiA 
KATYERIJRWLVHRINKALDRTKRQGASFIGILDIAGFEIFELN 
SFEQLCINYTNEKLQQLFNHTMF I LEQEE YQREGIE WNFIDFGL 
DLQPCIDL IERPANP PGVLALLDEECWFP KATDKTFVEKLVQEQ 
GSHSKFQKPRQUCDKADFCI IHYAGKVDYKADEWDMKNMDPLND 
NVATLLHQS SDRFVAE LW KDVDR IVGLDQ VTGMTETAFGSAY KT 
KKGMFRTVGQL YKESLTKLMATLRNTNPNFVRC 1 1 PNHEKRAGK 
LDPHLVLDQLRCNGVLEG I RI CRQGFPNRI VFQEFRQR YE ILTP 
NAI P KG FMDG KQ ACE RM I RAL E LD PNLYR I GQS KI F FRAG VLAH 
LEEERDLKT TDI 1 1 FFQAVCRG Y1*ARKAFAKKQQQI*S AI»KVt*QR 
NCAAYLKLRH WQWWRVFTKVKP LLQ VTRQEE ELQAKDEELLKVK 
EKQTKVEGELEEMERKIIQQLLEEKNI LAEQLQAETELFAEAEEM 
RARIAAKKQBLEEILHDLESRVEEEEERNQILQNEKKKMG^IQ 
DLEEQi^EEEGARQKI^LEKVTAEAKIKKMEEEILLLEDQNSKF 
IKEKKLMEDR I AECSSQLAE EE E KAKNLAK I RCTKQEVMI SDLEE 
RLKKEEKTRQELEKAKRKLDGETTDLQDQIAELQAQIDELKLQL 
AKKEEEUJGALARGDDETIJIKNNAIJCVVREI^ 
EKASRNKAEKQKRDLSEELEALKTELEDTLDTTAAQQELRTKRE 
QEVAELKKALEEETKNHEAQIQDMRQRHATALEELSEQLEQAKR 
FKAKLE KNKQG LETDNKE LACEV KVLQQ VKAES EHKRKKLDAQ V 
QELHAKrV"SEGDRLRVEIxAEKASKLQNEI»DNVSTI^^ I K 
FAKDAASLESQLQDTQELLQEETRQKLNLSSRIRQLEEEKNSLQ 
EQQBEEEEARKNliEKQVIiALQSQIiADTXKKVDDDIfGTXESLE£A 








KXKI^KDAEAI^QRI^KAIAYDKLEOT 

HQRQVAS NLE KKQ \ KKFDQLIiAEE KS I S AR YAEERDRAEAEARE 
KETKAI»SIiARAT iRH AT »RAKEBFERQNKQLRADMEDLMSS KDDVG 
KNVHEIiE KS KRALE QQV\ EEMRTQ LEEIiEDEX*QATEDAKLRX.EV 
NMQAMKAQFERDLO/TRDEQNEEKKRLIi IKQVRELEAEIiEDERKQ 
RALAVAS KKKME IDL KDLEAQ I EAANKARDEVrKQLRKLQAOMK 
DYQRE IjEBARASRDE I FAQSKESEKKLKSLEAEILQLQEELASS 
ERARRHAEQ ERDELADE ITNSAS G KSALLDEKRRLEARI AQLEE 
ELEEEQSNMELI^RFRKTTI^VDTLtlAEIJU^RSAAQKSDNAR 
QQLERQNKEliKAKLQELEGAVKSKFKATI SALEAKlGQIiEEQLE 
QEAKERAAANKDVRRTEKKLKEIFTOVEDERRKTU^QYKEQMEKA 
KARMKQLKRQLEEAKEEATRANASRRja*QREI^DATEANEGL»SR 
EVSTLKNRIiRRGGP I SFSSSRSGRRQLHI»EGASLELSDDDTESK 
TSDVNETQPPQSE 


5932 


33 


Si2 


RHLEEICFLFLQKGRKLKI>SGPRWEEGKPRGTGGbWVKAEANMG 
FGATLAVGLTIFVLSWTI I ICFTCSCCCLYKTCRRPRPV\APP 
PHP P/ PWHAPYPQPPSVPPSYPGPS YQG YHTMPPQPGMPAAPY 
PMQ YP P P Y PAQPMGP PAYHETLAGGAAAP YPASQPP YNPAYMDA 
PKAAIi 


5933 


1 


3190 


GTRKIiKMADKTPGGSQKASSKTRSSDVHSSGSSDAHMDASGPSD 
SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLS11PLLENKLK 
AFS IGXMSTAKRTLS KKEQEELKKKEDE KAAAE I YEE FLAAFEG 
SIX3NKVKTFVRGGVVNAAKEEHETDEKRGKI YKPSSRFADQKNP 
PNQSSNroPPSLLVIETKKPPLKKGEKEKKKSNIiEIiFKEEIiKQI 
QEERDERHKTKGRIiSRFEPPQSDSDGQRRSMDAPSRRrrR.SSGVL. 
DDYAPGSHDVGDPSTT\NFY1>GNI\NPQMNLKKCCCCEFGRFGP 
IjASVKIMWPRTDEERARERNCX3FVAFMFn^DAERAIjKNI>NGKMI 
MSFEMKIjGWGKAVPIPPHPIYTPPSMMEHTLPPPPSGLPFNAQP 
RERLKWPNAPM1.PPPKNKEDFEKTLSQAIVKWIPTERNIiLALI 
HRMIEFVVREGPMFEAMI^INREINOTMFRFLFENO/^PAHVYYRW 
KLYSILQGDSPTKWRTEDFRMFKNGSFWRPPPLNPYIiHGMSEEQ 
ETEAFVEEP S KKGALKE EQRDKLEE I L.RGLTPRKND IGDAMVFC 
LNNAEAA^IVDCITESLSILKTPLPKKXARLYIiVSDVLYNSSA 
KVANASYYRKFFETKLCQIFSDLNATYRTIQGHIiQSENFKQRVM 
TCFRAWEDWAIYPEPFIilKLQNIFLGLVNIIBEKEl'EDVPDDLD 
GAPIEEELDGAPLEDVDG I P IDAT P I DDLDGVP I KSUDDDLDGV 
PLDATEDSK1CNEPIF1CVAPSKWEAVDESELFJVQAVTTSKWELFD 
QHEESEEEENQNQEEESEDEEDTQSSKSEEHKLYSNPIKEEMTE 
SKFSKYSE^EEKRAKLPJEIELKVMKFQDELESGKRPKKPGQSF 
QEQVBHYRIJKLLQREKEKEIjERERERDKKDKEKIaESRSKDKKEK 
DECTPTRKERKRRHSTSPSPSRSSSGRRVKSPSPKSERSERSER 
SHKESSRSRSSHT03SPRDVSKKAKRSPSGSRTPKRSRRSRSRSP 
KKSGKKSRSQSRSPHRSHKKSK5KTNTGRKFFKKAVTYWKCDLF 

LCPERSVF 


5934 


1 


3190 


GTRKL KMADKT PGG S QKAS S KTRSS DVHS SGSSDAHMDASGPS D
SDMPSRTRPKSPRKHNYRNESAPJESLCDS PHONIC RPLIjEKKLK 
AFSIGKMSTAKRTI^KKEQEELKKKEDEKAAAEIYEEFIiAAFEG 
SIX3NKVKTFVRGGVWAAKEEHETDEKRGKIYKPSSRFADQKKP 

PNQSSNERPPSUJVI ETKKPPLKKGEKEKKKSNLEliFKEELKQ X 
QEERI)ERHKTKGRI*SRFEPPQSDSIX3QRRSMDAPSRRlvrRSSGVl, 

DDYAPGSHDVGDPSTT\NFYljGNl\NPQMNI»KKCCCQEFGRFGP 
IJ^VKIMWPRTDEERARERNCGFVAFMNRRDAERAI^KNLNGKMI 
MS FEMKLGWGKAVPXPPHP I YI PPSMMEHTLPPPPSGIiPFNAQP 
RERLKNPNAPMLPPPKNKEBFEKTLSQAIVKWIPTEJiNIjriALI 
HRMIEFVVREGPMFEAMIMNREINNPMFRFLFENQTPAHVYYRW 

KLYSIMGDSPTKWRTEDFRMFKNGSFWRPPPLNPYIiHGPaSEEQ 
ETEAFVEEPSKICGAiKEEQRDKI^II^GLTPRKT^D IGDAMVFC 

LNNAEAAEE I VDC1 TES LS I I*KTPI*P KK IARLYT>VSDVXi YNS S A 
KVANAS YYRKFFE^KI^QIFSDI^ATYRTI QGHLQSENFKQRVM 
TCFRAWEDWAIYPBPFLIKLQNI FLGLVNI I EEKETEDVPDDI>D 
GAPIEEELIX3APLEDVDGIPXDArPIDDIJ3GVPIKSI>DDDI^^ 








PLDATEDSKKNEP I FKVAP SKW EAVDES ELEAQAVTTS KWEL FD 
QHEESEEEENQNQEEESEDEEDTQSSKSEEHHIiYSNPIKEEMTE 
SKFSKYSEMSEEKRAKliREIEbKVMKFQDEIiESGKRPKKPGQSF 
QEQVEHYI^KIjIiQREKEKELEJlEliERDKKDKEKLESRSKDKKEK 
DECTPTRKERXRRKSTSPSPSRSSSGRRVKSPSPKSERSERSER 

SKKESSR5 RSSHKDS PRDVSKKAKRS PSGSRT PKRSRRSRSRS P 
KKSGKKSRSQSRSPHRSHKKSKGKTNTGRKFFKKAVTYWKCDLF 

LCPERSVF 


5935 


3 



4493 



S Y WLSGWRLS RP PRQFWAGWRG I GRFGTMAPVHGDDCE I GASALi 
SDSGS FVS S RARREKKS KKGRQEAIiE R LKKAKAGERYKYEVEDF 
TGVYE EVDEEQ YS KLVQ ARQDDDW IVDDDG IGYVEDGREI FDDD 
LEDDALDADE KGKDGKARNKDKRNVKKIjAVTKPNN I KSM F I ACA 
G KKTADKAVDLS KDGLLGDI LQDLNTETPQ IT P PPVMI LXKKRS 
I GAS PNP FS VHTATAVPSGK IAS PVS RKE PPLT P VPLKRAEFAG 
DDVQVESTEEE QESGAMEFEDGDFDE PMEVEEVDLE PMAAKAWD 
KESEPAEEVKQEADSGKGTVSYLGSFLPDVSCWDIDQEGDSSFS 
VQEVQVDSSHLPLVKGADEEQVFHFYWI*DAYEDQYNQPGVVFIiF 
GKWIESAETHVSCCVMVKNIERTLYFXjPREMKIDLNTGKETGT 
F I SMKDVYEE FDE KIATKYKI MKFKS KPVE KNYAFE I PDV PEKS 
EY1jEVKYSAEMPQLPQDIJCGBTFSHVFX3TNTSSI»EIjFLMNRKIK 
GPCWliEVKKSTAI^QPVSWCKVEAMAIiKPDLVNVIKDVSPPPLV 
VMAFSMKTMQNAKNHQNEI IAMAALVHHSFALDKAAPKPPFQSH 
FCWS KPKDC I FP YAFKEVT EKKNVKVEVAATERTLLGFF1AKV 
HKIDPDI IVGHNIYGFELEVLLQRINVCKAPHWSKIGRLKRSNM 
PIO^GGRSGFGERNATCGRMICDVEISAKEIjIRCKSYHLSELVQQ 
ILKTERWI PMENIQNMYSESSQLLYI*IjEHTWKDA\KFIIiQI MC 
E^iNVIiPIAIiQITNIAGNIMSRTIjMGGRSERNBFLULHAFYENNY 
IVPDKQIFRKPQQKLGDEDEEIIX3DTNKYKKGRKKGAYAGGLVL 
DPKVGFYDKFILLL.DFNSLYPSIIQEFNICFTTVQRVASEAQKV 
TEDGEQ3QIPELPDPSI.EMGILPREIRKLVERRKQVKQLMKQQD 
I2qpDLII^YDIRQKAIJ<XTANSMYGCLGFSYSRFYAKPIAALVT 
YKBREII*MHTKEMVQKMNI*EVIYGDTDS IMINTNSTNLEEVFKL 
GNKVK£EVNKLYKLLEIDIIX5VFKSL 

GNYVTKQELKGLDIVRRDWCDIAKDTGNFVIGQILSDQSRDTIV 
ENI QKRIj I B IGENVLiNGS VPVSQFE INKALTKDPQDYPDKKS LP 
HVHVALiW I NS QGG RKVKAGDTVS YVI CQDGSNLTAS QRAY A P E Q 
I^KQDNLTIDTQYYIAQQIHPWARICEPIDGIDAVLIATGWEL 
\DPTQFKVHHYHKDBENDAI*IiGGPAQIjTDFJElCYRI)CERFKCPCP 
TCGTEN I YDNVFDGSGTDME PSLYRCSNI DGKAS PLTFTVQLSN 
KLIMDIRRFIKKYYIX5WLICEEPTCP^TRHLPLQFSRTGPI»CP 
ACMKATLQ P EY S DKS LYTQLCFYRY I FDAECALEKb 1 1 uHEKDK 
LKKQFFTPKV^DYRKLKNTAEQFLSRSGYSEVirLSKijFAGCAV 

KS 


5936 



1124 


139 


RGEEQFPAEFRRFACLGFGERLQEFSRIiljRAVHRS RAWTCY LA I 
RMLMATCCPSPTTTACTGPWQRAPPLRLLVQKKEADSSGLAFAS 
NSLQRRKKGLLI^PVAPLRTRPPIAISLPQDFRQVSSVIDVDLL 
PETHRRVRLHKHGSDRPIiG FY I RDGMS VRVAPQG \ LERVPG I FI 
SRLVRGGLAESTGLIAVSDBI LEVNG I EVAG ICTLNQVTDMMVAN 
SHN\LIVTVKPANQRNNVVRGASGRLTGPPSAGPGPAEPDSDDD 
SSDLVIENRQPPSSNGLSQGPPCWDLHPGCRHPGTRSSLPSLDD 
QEQASSGWGSRIRGLAroL»roi» i 


5937 


31 


1600 


' PTSI^KSTVQLMCRLLQDKRYQCVYSIiAE I FKVIAS FY VILVI L 
YGLTSS YSLWWMLRSSUCQYS FEALREKSNYSDI PDVKNDFAFI 
IJUjADQYDPLYSKRFSIFIjSEVSENKLKQINL^EWTVEKLKSK 
LVKNAQD KX ELHLFMLNGLiPDNVF E LTEMEVLS LEL I P EVKX, P S 
AVS QLVNL KE LR VYHS S L WDHPALAFLE EN LKI LRLKFTEMG K 
I PRWVFHLKNLKELYIjSGCVLPEQLSTMQLEG FQDIiKNLRTLYL 
KSSI^RIPQVVTDLLPSLQKLSLDNEGSKLVV^ 
LELISCDLERIPHSIFSI^LHEUDLRF^LKTVEEIISFQHLQ 

nlsclklwhnniayipaqigalsnleqlsldhjwieni.plqlfi. 
ctklhyldlsynhltfipeeiqylXsnlq^ 








LFQCKKLQX^LLLG KN S LMNLS PHVGELSNLTHREPIG YNYLETL 
PPELEGCQSIiKRNCbIVEE^IJm,PLPVTERI^ 


5938 


395 


1865 


YKX3EGFFCWQEARGERRKKKKAMSSPNIWSTGSSVYSTPVPSQK 
TVILNNLLEGYDNTCLRPDIGVKPTL^ 

YTIDI FFAQTWYDRPJLKFNST I KVLRLNSNMVGKIW I PDTFFRN 
SKKADAHWITTPNI^IIjRIWNIXniVLYS^ 

MDEHSCPLEFSS YGYPR EBI VYQ WKRSS VEVGDTRSWRL YQFS F 
VGLRNTTEVVKTTSGDYVVMSVY FDLSPJRMG YFTIQT Y I PCTLI 
WLS WVS FW 1 N KDAVP ARTS LG I TTVLTMTTLS T IAR KS LP KVS 
YVTAMDLFVSVCFI FVFSALVE YG \TLHY FVSNRKPS KDKDKKK 
KNPAPTID IRPRSATIQMNNATHLQERDEEYGYECLDGKDCAS F 
FCCTEDCKTGAWRHGRIHIRIAKMDS YARI FFPTAFCLFNLVYW 
VSYLYL


5939




66 


1404


IRPG YLKEV QENS PGHRAGt*EPFrDFI VSINGSRJ^KDNDTL»Ki> 
LLKANVEKPVTQ4LIYSSKTLEXRETSVTPSNLWGGC<IL^ IR 
FCS FIX3ANENVWHVLBVESNSPAALAGLRPHSDYI IGADTVMNE 
SEDLFSLIBTHEAKPLKLYVYNTDTDNCREyi ITPNSAWGGEGS 
LGCGIGYGYLURI PTRPFEEGKKISLPGQMAGTPITPLKDGFTE 
VQI*S S VNPPS LS P PGTTG I EQSLTGLS I S5T P \ PAVSS VLSTGV 
PTVP\LLPPQVNQSLTSVPPMESSY7»HT >PGLMPFTRQGLPNLPQ 
PSTFNLPR\PTHSWPGVGLYQEFVKPGVLPPLSSMPPRNLPG\I 
APLPLPSEFLPSFPLVPBSSSAASSGELLSSLPPTSNAPSDPAT 
TTAKADAASSLTVDVTP PTAKAPTTVEDR VGDS TP VSE KP VSAA 
VDANASESP 


5940 


145 


717 


RRSASRSAS PRQSAGTAVTTCraAGGTCLAAAHHRMRWPJU^GRS 
LEKLPVHMGLVITEVEQEPSFSDXASLVVWCMAVGISYISVYDH 
QG I FKRNNS RLMDEII*KO^QEIjIiGIj«DCSKYSPEFANSNDKDDQV 
r^CmAVKVLSPEDGKADIVRAAQDFCQLVAQKQKRPTDLDVDT 
LA\VYLVQMWIiILI 


5941 


13 


6147 


MCLGRMGASSPRSPEPVGPPAPGLPFCCGGSIJ^VWLLALPVA 
VTCQCNAPEW\LPFARPTNLTDEFEFPIGTYLNYECRPGYSGRPF 
S 1 1 CLKNS VWTGAKDRCRRXS CRNPPDP VNGKVHV I KG IQFGS Q 
I KYSCTKGYRI.IGSSSATCI I SGDTVIWDNETP1CDRI PCGLPP 
TI TNGD FI STNRENFHYGS VVTYRCN PG S GGRKVFEL VGEPS I Y 
CTSNDDQVGIWSGPAPCCI I PNKCTPPNVENGI I»VS DNRSLFS L 
NEWEFRCQPGFVMKGPFJIVKCQALNKWEPELPSCSRVCQPPPD 

VLHAERTQRXJKDNFS PGQEVFYSCEPG YDLRGAASMRCTPQGDW 
S PAAPTCEVKS CDD FMGQI»I*NGRVLFPVNLQIX3AKVDFVCDEG F 
QLKGS S AS YCVLAGMESIiWNS S VPVCEQ I FCPS P PVI PNGRHTG 
KPLE VFP FGKAVNYTCDPHPDRGTSFDI* IGEST IRCTSDPQGNG 
VWSS PAPRCGILGHCQAPDHFL FAKLKTQTNASDFP IGTSLKYE 
CRPEYYGRPFS 1 TCLDNLVWS 3 PKDVCKRJCS CKTP PDPVNGMVH 
VITOIQVGSRINYSCTTGHRLIGHSSAECILSGNAAHWSTKPPI 
CQRJPCGLPPTIANGDFISTNRENFHYGSVVTYRCNPGSGGRKV 
FELVGE PS I YCTSNDDQVG I WSGPAPQCI I PNKCTPPNVENG 1 1* 
VSDNRS LFSLNE VVEFRCQPG FVMKGPRRVKCQALNKWEPEL PS 
CSRVCQP PPDVLHAE RTQRDKDN FS PG QEVFYS CE PG YD LRGAA 
SMRCTPQGDWS PAAPTCEVKS CDDFMGQL LNGRVLFP VNLQLGA 
KVDFVCDEGFQLKGSSASYCTTLAGMESLWNSSVPVCEQI FCPSP 
PVT PNGRHTGKPIYE V FP FGKAVNYTCDPHPDRGTS FDL IGEST I 
RCTSD PQGNGVWS S PAP RCG I LG HOQ APDH FL FAKL KTQTN AS D 
FPIGTSLKYECRPEYYGRPFS ITCLDNLVWSSPKDVCKRKSCKT 
PPDPVNGMVHVITDIQVGSllINYSCrTGHRLIGHSSAECILSGN 
TAHWSTKPPIOQRI PCGLPPTI ANGDFI STNR3NFHYGS WTYR 
CKLGSRGRKVFELVGEPSI YCTSNDDQVG I WSGPAPQCI I PNKC 
TPPNVENG ILVSDNRS LFSLNEVVEFRCQPG FVMKG PRRVKCQA 
LNKWEPBLPSCSRVCQPPPEII*HGEHTPSHQDNFSPGQEV?YSC 
EPGYDLRGAASLHCTPQGDWS PEAPRCAVKS CDDFLGQLPHGRV 
LFPIJSrW}IX3AKVSFVCDEGFRLKGSSVSHCrVLVGMRSLW^SVP 
VCEHIFCPNPPAILNGRHTGTPSGDI PYGKE IS YTCDPHPDRGM 














TFNLIGESTIRCTSDPHGNGVWSSPAPRCELSVRAGHCKTPEQP 
P FAS PTI P I NDFEFPVGTS LNYECRPGYFG KMFS ISGLENLVWS 
SVEDNCRRKSCGPPPEPFNGMVHIOTDTQFGSTVNYSCNEGFRL 
IGS PSTTCLVSGNNVTWDKKAPICE I ISCEPPPTISNGDFYSNN 
RTSFHNGTVVTYQCHTGPDGEQLFELVGERS IYCTSKDDQVGVW 
SSPPPRCISTNKCTAPEVENAIRVPGNRSFFSLTEIIRFRCQPG 
FVMVGS HTVQCQTNGRWGPKLPHCSRVCQ P P PE I LHGEHTLSHQ 
DNFS PGQEVF YS CEPS YDLRGAAS LHCTPQGDVJS PEAPRCTVKS 
CDDFLGQLPHGRVTJjPLNLQLGAKVSFVCDEGFRLKGRSASHW 
LAGMKALWNSSVPVCEQIFCPNPPAILNGRHTGTPLGDIPYGKE 
VS YTCD PH PDRGMTFNLI GEST I RRTS EPHGNGVWSS PAPRCEL 
PVGAAC PHP PKI QNGHYI GGRVSLYL PGMT I S YTCDFGYLLVGK 
GFIFCTDO^IWSQI^HYCICEVNCSFPLF^INGISKELEMKKVYHY 
GDYVTLKCEDG YTLEGS PWSQCQADDRWDPPLAKCTSRTHDALI 
VGTI^GTIFFII^IIFI*SWIIL«KHRKGNNAHENPKEVAIHLHSQ 
GGSSVHPRTLQTNEENSRVLP 


5942 


4509 


688 


YLYVRMRANPLAYGISHKAYQIDPPL\RKHREQ\LVIE\VGRKL 
DK\AQM I RFEERTG YFSSTDLGRTASHYYIKYNTI ETFNELFDA 
HKTEGDIFAI VS KAEEFDQ I KVREEE I EELDTTj T.S NFCBLS TPG 
GVENSYGK^KILLQTYINRGEMDSFSLISDSAYVAQNAARIVRA 
LFE IALRKRWPTMT YRLLNLS KAIDKRLWGWAS PLRQFS I LPPH 
MLTRLEEKKLTVT3XLKIDMRKDBIGHII*HHVNIGLKVKQCVHQIP 
SVMMEAFI QP ITRTVLRVTLS I YADFTWNDQVHGTVGEP WWI WV 
EDPTNDHIYHSEYFLALKKQVISKEAQLLVFTIPIFEPLPSQYY 
IRAVSDRWLGAEAVCI INFQHLILPERHP PHTELLDLQPLPITA 
LGCKAYEALYNFSHFNPVQTQI FHTLYHTDCNVLLGAPTGSGKT 
VAAELAI FRVFNKYPTSKAVYIAPLKALVRERMDDWKVRI EEKL 
GKKV I E LTGD VT PDMKS IAKAD L IVTTP E KWEG VS RS WQNRNYV 
QQVTILI IDEIHLLGEERGPVLEVIVSRTNFISSHTEKPVRIVG 
LSTAI»ANARI)IADWLNIKQMGLFNFRPSVRPVPLEVHIQGFPGQ 
HYCPRMASMNKPAFQAIRSHSPAKPVLI FVSSRRQTRLTALELI 
A?L£TEEDPKQWLNMDEREMENI IATVRDSNLKLTLAFGIGMHH 
AGLHERDR KTVEELFVNCKVQVLi I ATS TLAWGVNFPAHLV 1 1 KG 
TEYYIX5KTRRYVDFPITDVLQMMGRAGRPQFDDQGKAVILVHDI 
KKD FY KK FL YE P F P VE S S LLC VLS DHLNAE I AGGT ITS KQDALD 
YITWTYFFRRLIMNPS YYNLGDVSHDSVNKFLSH1.I EKSLI ELE 
LS YCIE IGEDNRS I E PLTYGR I AS YY YLKHQTVKMFKDRLKP EC 
STEELLS I LSDAEEYTDLPVRHNEDHMNS ELAKCLPI ESNPHSF 
DSPHTKAHLLLQAHLSRAMLPCPDYOTD^ 

DVAANQGWLVTVLN I TNLIQMVI QGRWLKDSS LLTLPN I ENHHL 
HLFKKWKPIMKGPHARGRTSIECLPELIHAOSGKDHVFSSMVES 
ELHAAKTKQAWNFLSHLPEINVG IS VKGS WDDLVEGHNELS VST 
LTADKRDDNKWI KLHADQEYVLQVSLQR VHFGFHKG KPESCAVT 
PRFPKSICDEGWFLILGEVDKRELIALKRVGYIRNHHVASLSFYT 
PEI PGRYI YTLYFMSDCYLGLDQQYD/NLSQRYTSESFCTGQHQ 
GL 


5943


1 


2274 



DKPTRHKTYLSSSWAKMAAAEGPVGDGELWQTWLPNHVVFLRLR 
EGLKNQS PTEAEKPASSSLPSS PPPQLLTRKWFGLGGELFLWD 
GEDSSFLWRLRGPSGGG\EEPALSQYQRLLCINPPLFEIYQVL 
LS PTQHHVAL I G I KGLMVT .ET.PKRWGKNSEFEGGKSTVNCSTTP 
VAERFFTSSTSLTLKHAAWYPSEILDPHVVLLTSDNVIRIYSLR 
EPOTPTNVI I LS EAEEES LVLNKGRAYTAS LGETAVAFDFGPLA 
AVPKTLFGQNGKDEWAYPLYILYENGETFLTY ISLLKS PGN/ 1 
WKAVGS IAHAS \ AAEDNYG YDACAVLCLPCVPNI LVIATESGML 
YHCWLEGEEEDDHTSEKS WDSR I DL I PSL YVFECVELELALKL 
ASGEDDPFBSDFSCPVKiaRDPKCPSRYHCTHEAGVHSVGLTWI 
HKLHKFLGSDEEDKDSLQELSTEQKCFVEH ILCTKPLPCRQPAP 
IRGFVJIVPDII^PTMICITSTYECLIWPLLSTVHPASPPIjLCTR 
EDVEVAESPLRVLAETPDS FEKHIRS ILQRSVANPAFLKAS EKD 
IAPPPEE<2I^LIjSRATQVFREQYIIJCQDLAKEEIQRRVKLLCDO 
KKKQL EDLS YCRE ERKS LREMABRLAD KYEEAKEKQEDI MNRMK 








KIJ^FHSELPV1^DSERDM:<KEI^LIPIX2LRHI^ 
KDYQQQKMEKVLS LPKPT 1 1 LSAYQRKCIQS I LKEEGEH I REMV 
KQINDIRNHVNF 


5944 


167 


3428


PS I AT FTDEPEVltTBP PS ATTTTT IG I S ATWTTIiAGSHGKRNNT 
ITTTSS KRKNRKWKITPENVQI IPDDPLPIS YSQPEKVNGESKS 
SSTSESGDSDNMR I SSCSDESSNSNSSRKSDNHS PAWTTTVSS 
PCKQPS VLVTFP KE ERKSVSGKAS I KLS ETTSEGTSNSliS TCTKS 
GPS PLS SPNGKLTVASPKRGQKRE EGWKEWRRS KKVSVPSTVT 
SRVIGRGGCNINAIREFTGAHIDIDKQKDKTGDRIITIRGGTE3 
TRQATQL INAIi I KDPDKB I DEDI PKNRLKSS SANSKI GS S APTT 
TAAl^SLMGIKMTTVALS S TS QTATALT VPAI S SASTHKTI KNP 
VN\NVRPGFPVS FP\ IiAYPPPQFAHAIjIiAAQTFQQIRPPRLPMT 
HFGGTFPPAQSTWGPFPVRPLS PARATNS PKPHMVPRHSNQNSS 
GSQVNSAGSLTS S PTTTTS S S ASTVPGT STNGS PSS P SVRRQLF 
VTWKTSNATTTTVTTTASNNNTAPTNATYPMPTAKEHYPVSS ? 
SSPSPPAQPGGVSRNSPLDCGSASPNKVASSSEQEAGSPPVVET 
TNTRPPNSSSSSGSSSAHSNQQQPPGSVSQEPRPPLQQSQVPPP 
EVRMTVPPLATSSAPVAVPSTAPVTYPMPQTPMGCPQPTPKMET 
PAIRPPPHGTTAPHXNSASVQNSSVAVLSVNHIKRPHSVPSSVQ 
LPSTI*S TQSACQNS VHPANKP I APNFSAPL PFG P FSTLFENS PT 
SAHAFV7GGSWSSQSTPESMLSGKSSYliPNSDPI*HQSDTSKAPG 
FRPPLQRPAPSPSG I VNMDS P YGSVTPSSTHLGNFASNISGGQM 
YGPGA PLGGAPAAANFNRQHFS PLSLLTP CSS ASNDS SAQS VSS 
GVRAPS PAPSSVPLGS EKPSNVSQDRKVP VPIGTERSARI RQTG 
TS APSVTGSNLSTSVGHSGI W S FEGIGGNQDKVD WCNPGMGNPM 
IHRPMSDPG VFSQHQAMERDS TG I VTPS GTFHQHVPAG YMDFPK 
VGGMPFSVYGNAMI PP VAPI PDGAGGPI FNGPHAADPSWNSL1 K 
MVS SSTENNG PQTVWTG PWAPHMNS VHMNQLG 


5945 


1461 


197 


GVTKLFIjFGKRKIjRNG IAEDLKGQADFF fli*vsea WATGS PRA 
WLTCLILPLPGIIFSVLPKAMSRPLLITFTPATDPSDLWKDGQQ 
QPQPEKPES TIiDGAAARAFYEAIil GDE S SAPDSQRSQTEP ARER 
KRKKRR IM KAPAAEAVAEGAS GRHGQG R SliE AED KMTHR I LRAA 
QEGDLPELRRLLEPHEAGGAGGlSriNARJDAFWWTPLMCAARAGQG 
AAVS Y1»LGRGAAWVGVCE LSGRDAAQIiAEE AG F P EVARMVRE S H 
GETRSP ENRSPTPS LQ YCENCDTHFQDSNHRTS TAKLIiS LSQG P 
QPPNLPLGVPISSPGFKLLIiRGGWEPGMGLGPRGEGRANPIPTV 
LKRDQEGLG YRSAPQ PRVTHF P AWDTRAVAGRE \TPPRVATLSW 
RE BRRREE \ KDRAWERDLRTYMNLEF 


5946 


541 


1666 


ILGSYSSIQPEEYS \SWC\EWLQDLIA\YVS PK\HSYLRDIiP 
S EGS PQRVNS I DFV\EL\ EHLQPDVLVHAVLRVVDF/T I LTEAV 
YSYRGQKQKKVT4LTVEQAQDQHYALVLWGPGAAW\YPQL,QRKKG 
YIWEFKYLFVQCNYTLENLELHTTPWSSCECIiFDDDIRAITFKA 
KFQKSAPS FVKI SDIATHLEDKCSGWLI KAQI SELAFP ITASQ 
KIALNAHSSLKS IFSSLPNI VYTGCAKCGLEI.ETDENRI YKQCF 
SCIiPFTMKKIYYRPALMTAIDGRHDVCIRVESKLIEKILLNISA 
DCLNRVI VP S S E I TYGMWADLFHS LLAVSAE PCVLK I Q SL FVL- 
DENS Y PliQQDFS LLDFYPD I VKHGANARIj 


5947 


3 


1317 


RGIPDRRRRGPIGRVNMDLENKVKKMGLGHEQGFGAPCbKCKEK 
CEGFELHFWRKIC1WC\NVAKKSM/TVLI^NEEDRKVGKLF3DT 
KYTTL I AKLKS DG I PMY KRNVM I LTN P VAAKKNVS I NT VT YE W A 
P P VQNQALARQ YMQMliPKE KQ P V AGS E.GACJ YKJsJ^Ul-»AxvUlji J AriiJ 
QDPSKCHEI»SPREVKEMEQFVKKYKSEA1X3VOTVKLPCEMDAQG 
PKQMN I PGGDRS TPAAVGAMEDKS AEHKRTQ YS CYCCKLS MKEG 
D PAI YAERAG YDKLiWHPACF VCSTCHELLVDMI Y FWKNBKI*YCG 
RHYCDSEKPRCAGCDELI FSNEYTQAENQNWHLKHFCCFDCDS 1 
1AGE I YVMVND KP VCKP CTVKNHA WCQG CHI^I DP EVQR VTYN 
NFSWHASTECFXCSCCSKCLIGQKFMPVEGMVFCSVECKKRI'IS 


5948 


39 


3370 


YRERYPVSGGSVIiRSAIjEVCWDFLSGLTEGSLLPEGFFSGPIDQ 
GNHYQMRRKGllCHRGSAT^RHPSSPCSVKHSPTRETLTYAQAQRM 

VEIEIEGRLHRIS ifdplei ileddltaqemsecnsnkenserp 

PVCliRT^G^HKNN^VKKKNEALPSAHGTPASASAI»PEPKVR IVE Y 








S PPSAPRRPPVYYKF IEKSAEEliDNEIVEyDMDBEDYAWIjE IVNE | 
KRKGDCVPAVS QSMFEPLMDRFBKESHCENQKQGBQQS liXDEDA 
VCCI C^IDGECQNSNVTLFCDMCNIAVHQECYGVPYI PEGQWLC / 
RAHCLQS RARPADCVLCPNKGGAFKKTDDDRWGHV\ VCALW \ I P
E \ VGFANTVTI EPI pGVRNI P PARWKLT\ QdCKE KGR/ VGACI 
QCTKANCYTAFHVTCAQKAGLYMKl^PVKELTGGGTTFSVRKTA 
YCDVHTP PG CTRRPLN I YGDVEMKNGVCRKES S VKTVRS TS KVR 
KKAXKAKKALAE PCAVLPTVCAP Y I P PQRLNR IANQVAI QRKKQ 
FVERAHSYWI^KIU^PJ^GAPI^RRI^SSLQSQRSSC^RENDEEM 
KAAKEKXKYWQRIiRHDLERARLLI EIJ^KREKIjKI^VK^EQVA 

MELRLTPLTVLLRSVLDQLQDKDPARIFAQPVSUCEVPDYLDHI 
KHPMDFATMRKRJ^EA^YKNLHEFEEDFDIil IDNOTKYNARDTV 
FYRAAVRLRDQGGVVLRQARREVDS IGLEEASGMHLPERPAAAP 
RR P FSWEDVDRIjIjDPANRAHI/SLEEQLRFXJj DMLDIiTCAM KSSG | 
SRSKRAKLLKKEIALLJINKIjSC^HSQPLPTGPGLEGFEEDGAAIi 

gpfageevlprletij^prkrsrstcgdseveeespgkrldagl 
tngfggarseqepgggi^rkatprrrcasessisssnsplcdss 
FNAPKCGRGKPALVRP^H'LEDRSELI sciengnyakaariaaev 
CQSSmt STDAAAS VLEPLKVVWAKCSGYPS YPALI IDPKMPRV 
PGHHNGVTIPAPPLDVLKIGEHMQTKSDEKXFLVLFFDNKRSWQ 
WLPKSKMVPI/jIDETIDKLKMMEGRNSSIRKAVRIAF^RAMNHXi 
SRVHGEPTSDLSDID


5949 



39 


3370 


YRERYPVSGGS\TLRSALEVCWDFLSGLTEGSLLPEGFFSGPIDQ 
GNHYQMRRRGRCHRGS AARHPS S PCS VXHS PTRETLiTYAQAQRM 
VEIB IEGRLHRI SI FDPLEI I LBDDLTAQEMSECNSNKENS ERP 
PVCLRTKRHKNNRVKKKNEAIiPSAHGTPASASALPEPKVRIVSY 
SPPSAPRRPPVYYKFIEKSAEELDNEVEYDMDEEDYAWIiEIVNE j 
KRKGDCVPAVSQSMFEFLMDRFEKESHCENQKQGEQQSLIDEDA 
VCXTICMDGECQNSNVILFCDMCNLAVHQECYGVPY I PEGQWLC/ 
RAHCLQS RARPAD CVLCPNKGGAFKKTDDDRWGHV\ VCALW \ I P 
E\VGFAmVFIEPIDG\n^ippARWKLT\CNLCKEKGR/VGACI 
QGHKANCYTAFHVTCAQKAGLYMKME P VKELTGGGTTFS VRKTA 1 
YCDVHTPPGCTTRRPIjNI YGDVEMKNGVCRKESS VKTVRSTSKVR 
KXAKKAKKAIAEPCAVLPTVCAPYIPPQRLNRIANQVAIQRKKQ 
FVERAHSYWLLKRLSRNGAPLLRRLQSSLQSQRSSQQRENDEEM 
KAAKEKLKYWQRIJ^DLERARLLIELl^KREKLKREQVKVEQVA 1 
MELRLTPLTVLLRSVLDQIiQDKD PAR I FAQPVSLKEVPDYLDH I 
KHPMDFATMRKRLBAQGYKNLHEFF/RDFDL I IDNCMKYNARDTV 
FYRAAVRLRDQGGVVLRQARREVDS IGI^FJVSSMHLPERPAAAP 1 
PJiPFSWEDVDRLIJ3PANRAHIiGIJ2EQIJlELI^M^^ 
SRSKRAKLLKXEIAI^RNKLSQQHSQPLPTGPGLEGFEEDGAAL 
GPEAGEEVLPRLETLLQPRKRSRSTCGDSEVEEESPGKRIjDAGL 
TNGFGGARSEQEPGGGLGRKATPRRRCASESSISSSNSPLCDSS 
FNAP KCGRGKPALVRRHTLEDRSEL I S CI ENGNYAKAAR I AAE V 
GQSSMWISTDAAASVLEPLKVVV7AKCSGYPSYPALI IDPKMPRV j 
PGHHNGVTIPAPPLDVLKIGEHMQTKSDEKLFLVLFFDNKRSWQ 
WLPKS KMVPLG I DET IDKLKMMEGRNS S IRKAVR I AFDRAMNHI* 
SRVHGEPTSDLSDID


5950 


1166 


373 


ESRSLTMSTSQPGACPCQGAASRPAILYALLSSSDKAVPRPRSR j 
CliCRQHRPVQLCAPHRTCREALDVLAKTVAFLRNIiPS FWQLPPQ 
DQRRLLQGCWG PLFLLGLAQDAVT FEVAEAPVPS I LKKI LLEEP 1 
SSSGGSG<2LPDRPQPSIiAAVQWI^CCLESFWSLEI^PKE \ YACL 
KGPILFNPDVPGIjOAASHIGHI^QEAHWVLC^VI£PWCPAAQGR 
LTRVLLTASTLKS I PTS LLGDLFFRPI IGDVD IAGLLGDMLLLR 


5951 


143 


5449


WNVKPSLLWQLFKFSDKEEHEQNDS ISGKTGETGVEBMIATRK 
VEQDSKETVKLSHEDDHI LEDAGSSDISSDAACTNPNKTENSLV 
GL PS CVDETVreCWLEIiKDTMGI ADKTENTLERNKI EPI/GYCEDA 
ESl^QI^STEFl^miEVVDTSTFGPESNILENAICDVPDQNSK 1 
QLNAIESTKI ESHE1*ANLQDDRNSQSSSVSYI*ESKSVKSKHTKP [ 
VIHS KQNMTTDAP KICCVAAKYEVIHS KTKVNVKS VKRNTDVPES j 
G^NFHRP\n<VRKKQIDKEPKIC^CNSGVKSVKNQAHSVLK3CTLQ | 








DQTLVQI FKPLTHSLSDKSHAHPGCLKEPHHPAQTGHVSHSSQK 
QCHKPCXXJAPAMKTJISHVKBELEHPGTO 

LQPRQRRSSKSPSLDEPPLPIPDNIATIRREGSDHSSSPESKYM 
WT P 3 KQ CG F CKKPHGNRFMVG CGRCDD W FHGDCVGI>S hS QAQQM 
GEEDKEYVCVXCCAEEDKKTK I LDPDTLENQATVE FHSGDKTME | 
CEKLGLSKHTTOT3RTKY1DDTVXHKVIQIJCRESGEGRNSSDCRD I 
NEIKKWQIAPLRKMGQPVLPRRSSEEKSEKIPKESTTVTCTGEK I 
ASKPGTHEKQEMKKKICV\EKG^ajNVHPAASASKPSAI^IRQSVR 
HS LKD ILMKRLTDSNLKVPEEKAAKVATKIEKELPS FFRDTDAK 
YKNKYRSIiMFNIiKDP KNNI ItFKKVLKGEVTPDHL I RM S P EEliAS 
KELAAWRRRENRHTI EM I EKEQREVERRP I TXITHKGEI EIESD 
APMKEQEAAMEIQEPAANKSLEKPEGSEK\R3CEEVDSMSKDTTS 
QHRQHLFDLNCKI C IGRMAP P VDDI»S PKKVKVWGVAR KHS DNE 
AES IADAkSSTSNII*ASEFFEEEKQES PKSTFSPAPRPEMPGTV j 
EVESTFIjARIJJFIWKGFINMPSVAKFVTKAYPVSGSPEYLTEDL 
PDS I QVGGRI S PQTVWD YVEKI KAS GT KB I CWRFT PVTEEDQ I 
SYTLLFAYFSSRKRYGVAANW1KQVKDMYLI PLGATDKI PHPLV 
PFDGPGLBLHRPNUjIiGI*I IRQKLKRQHSACASTSHIAETPESA 
PP IALPPDKKSKIEVS TEEAPEEENDFFNS FTTVLHKGRNKPQQ j 
NLQEDIiPTAVEPLMEVTKQEPPKPLRFLPGVL IGWENOPTTLEL 1 
ANKPDPVDDILQSLIjGTTGQVYDQ\AQSVMEQNTVKB I P FLNEQ 
TNSKIEKTDNVEVTDGENKEIKVKVDNI SESTDKSAE I ETS WG 
SSSISAGSLTSLSLRGKPPDVSTEAFLTNLS IQSKQEETVESKE | 
RTLKRQL<3EDQENNI*QDNQTSNS S PCRSNVGKGNIDGNVS CSEN 
LVANTARSPQFINIJaU3PRQAAGP^QPVTTSESKDGDSCRNGEK 
HMLPGI^SHNKEHLTEQINVEEKIiCSAEXNSCVQQSDNLKVAQNS 
PSVENI QTS QAEQAKPIjQEDIIiMQN I ETVHPFRRGSAVATSHFE 
VGNTCPSEFPS KS ITFTSRSTS PRTSTNFSPMRPQQPNLQHUCS 
SPPGFPFPGPPNFPPQSMFGFPPHLPPPLLPPPGFG\ FA\QNPM 
VPWPPW\HI>P\GQPQRMMGPLSQASRYIGPQNFYQVKDIRRPE j 
RRHSDPWGRQDQQQLDRPFNRG KGDRQRFYS DS HHLKRERHEKE 
WEQESERHRRRDRSQDKDRPRKSREEGHKDKERARLSHGDRGTD ! 
GKASRDSRNVDKKPDKPKSEDYEKDKEREKS KHREGEKDRDRYH 
KDRDHTDRTKSKR


5952


3226 


639 


PPARRSARDIJRAl>SMEAARPSGSWNGAIiCRliL\LVTI,\AFIjIF 
ASDACKNVTLHVPSK1J5AEKLVGRVNI,KECFTAANLIHSSDPDF | 
QILEIX3SVYTTNTIIjI*SSEKRSFTII^SNTBNQEKKKI FVFIjEH 
QTKVLKKRHTKEKVLRRAKRR WAP I PCSMLENSLGPFPLFXQQV 
QSDTAQNYTI YYS IRGPGVDQEPRNI*FYVERDTGNLYCTRP VDR 
EQYESFEIXAFATTPDGYTPEIjPLPL I IKIEDENDNYP I FTEET 
YTFTI FENCRVGTTVGQVCATDKDEPDTMHTRLKYS I IGQVPPS 
PTLFSMHPTTGVITTrSSQLDRELIDKYQIiKIKVQDMDGQYFGI* 
QTTSTCI INI DDVNDHX. PT FTRTS YVTS VEENTVD VE I LRVTVE 
DKDLVNTANWRAI^TILKGNENGNFKIVTDAKTNEGVLCVVKPL 
NYEEKQQMIIiQIGWNEAPFSREASPRSAMSTATVTVNVEDQDE 
GPECNPPIQTVRMKENAEVGTTSNGYKAYDPETRSSSGIRYKKL j 
TDPTGWVTIDENTGSIKVFRSI>DREAETIKNGIYNITVIiASDOG 
GRTCTGTLiGI ILQDVNDNS P F I PKKTVI ICKPTMSSAEIVAVDP I 
DEPIHGPProFSLESSTSEVQRMWRLKAINDTAARLSYQKDPPF 
GSYWPITVRDRIjGMSSVTSLDVTLCDCITENDCTHRVDPRIGG I 
ouv^iAj^nrt l I tMJiJjijoX/iljf r v_ J. J_»r 1 ij v v^o/^c>ljl oix-Ufivv j. ruu | 
LA0X3NLIVS1TIEAPGDDKVYSANGFTTQTVGASAQGVCGTVGSG 

iknggqetiewkgghc/tsescrgaghhhtldscrgg:-4tevdmc j 

R YTYSE WHSFTQPRLGEES IRGHTL IKW 




5953


330 


811 


PLLCNPDPGWYWWVKQESE I S KESQEMDARPKJuDIiGFKEGQTI K 
LC IGN I TNKKGGAS KPRTARGG G LS LL P P PPGG KVT I P PPSS / V j 
KLPSTNHVTPPS I PKSNHGGSDADITJ ,DLD5 PAP VTTPAPT P VS I 
VSNDLWGDFSTASSSVPNQAPQPSNWVQF




5954


32 


2130 


PPPPPPKXANMADLEAVLADVSYLM^ 

PEPSIRSVMQKYIJ^P^EITFDKIFNQKIGFLJLFKDFCLNEINE 1 
AVPQVKF YE2 1 KEYEKLDNBEDRLCRS R Q I YDA Y I MKELLS CSH | 



PFS KQAVEHVQSKLS KKQVTSTXiFQP Y I E E I CES IiRGD I FQKFM 
ESDKFTRPCQWKNVELNIHLTMNEFSVHRI IGRGGFGEVYGCRK 
ADTGKMYAMKCI^KKRIK>IKQGET1JU^RIMLSL.VSTGDCPFI 
VCMTYAFHTPDKLCFII^l^GGDLHYHLSQHGVFSBKEMRFYA 
TBI ILGLEHMHN^FVVYRDLKPANILLDEHGHARIS \DLGLACD 
FS KKKPHAS VGTHGYMAP E VLQKGTAYBSS ADWFSLGCMLFKLL 
RGKS P FRQHKTKDKHE XDRMTLTVNVELPDTFS P ELKS LLEGLL 
CRDVS KRLGCHGGGSQEVKEHSFFKGVDWQH VYLQKYPPPLI PP 
RGEVNAADAFDIGSFDEEDTKG iklldctoelyknfplvi SERW 
QQEVTETVYEAVNADTDKI EARKRAKNKQLGHEEDYALGKDCIM 
HG YMLKLGN P FLTQWQRR Y FYL FPNRLEWRGEGE SRQNLLTMEQ 
ILSVEETQIKDKKCILFRIKGGKQFVLQCESDPEFVQWKKELNE 
TFKEAQRLLRRAP KFLNKPRSGTVELP KPS LCHRNSNGL 



5955 



1726 



444 



KREREFRLAVCPLRYPSAYESSPGTELRECGLCRSGQEFADCRR 
PANRQDVLSGWINLPVLQLTKDPLKTPGRLDHGTRTAF I HHREQ 
VWKRCINIWRDVGLFGVIJSIEIANSEEEVFEWVKTASGWA1JUL,CR 
WASSLHGSLFPHLSLRSEDLIAEFAQVTNWSSCCLRVPAVJHPHT 
NKFAVALLDDSVRVYNAS STI VPSLKHRLQRNVASLAWKPLSAS 
VLAVACQSC IL I WTLD PTSLSTRPSSG CAQ VLSHPGHT P VTS LA 
WAPSGGRLLSAS PVDAAI RVWDVS TETCVP LPW FRGGG VTNLL W 
SPDGSKIIATTPSAVFRVWEAQMWTCER^TIiSGRCQTGCWS PD 
GSRLLFTVLGE PLI YSLS F PERCGEGKG\ ALEVQS QQRLWQ I CIi 
RGX2YRHQMVRRGLGERLTPWSGTPVGWVWLCL 



5956 



1705 



139 



GVGVRGARAMATVQEKAAALNLSALHSPAHRPPGFSVAQKPFGA 
TYVWSS I INTLQTQVBNTKKRRHRLKRHNIX^FVGSEAVDVI FSHL 
I QNKYFGDVDI PRAKWRVCQALMDY KVFEAVPTKVFGKDKKPT 
FEDSSCSLYRFTTI PNQDSQLGKENKLYSPARYADALFKSSD IR 
S ASLEDL WENLSLKP ANS PHVNI SATLS PQV I NEVWQEET I GRL 
LQLVDLPLLD5 LLKQQEAVP KI PQPKRQSTMVNS SNYLDRG ILK 
AYSDS QEDEWLSAA IDCSE YLPDQMWEI SRS FPEQPDRTDLVK 
ELLFDAIGRYYS S REPLLNHLS DVHNG IAELL VNG KTE IALEAT 
QLLLKLLDFXJNREEFRRl^YFMAVAANPSEFKI^QKESDNRMVVK 
RIFSKAI VDNKNLSXGKTDLLVLFL\MDHQKDVFKI PGTL\HKI 
VS \ VK \ LMAI QNGRDPNRDAG YI YCQRIDQRD YSNNTEKTTKDE 
LLNLLKTLDE3DS KLSAKEKkK\LLGQFYKCHPDI FIEHFGD 



5957 



1479 



451 



E LQ VAVAMDTLDR WKPKT KRAKRFIxE KRE PKLN EN I KNAMLI K 

GGNANATVTKVLKDVYALKKPYGVLYKKKNITRPFEDQTSLEFF 

SKKSDCSLFMFGSHNKJG^Pl^LVIGRMYDYHVLDMIEIiGIENFV 

SLKDIKKSKCPEGTKPMLIFAGDDFDVTEDYRRLKSLLIDFFRG 

PTVSNI RLAGLE YVLHFTALNGKI YFRS YKLLLKJCSGCRT PRIE 

LEEMGPSLDLVLRRTHLASDDLYKLSMKMPK^ 

T FGTTYGR I HMQKQDLS KLQTRKM \ KGLKKRPAER I TEDHEKKS 

KRI KKKLMELSQ PLLFHCVLLKRI IKHQS IQSFL 



5958 



3138 



AAALGMLLWFPACQAFNLDVEKLTVYSGPKGSYFGYAVDFHI PD 
ARTAS VLVGAPKANT S Q PD I VEGGAVY YCP W PAEGS AQCRQ I P F 
DTTNNRKI RVNGT KEP I EFKSNQWFG\ATVKA.\HKG KSCGP VAP 
LLFTWRNFLKPTPEKGPVGTCYVAIQNFSAYAEFS PCGNSNADP 
EGQGYCQAGFSLDPYKNGDL1VGGPGSFYWQGQVITASVAD I IA 
NYSFKDILRKLAGEKQTEVAPAS YDDS YLGYSVAAGEFTGDSQQ 
ELVAG I PRGAQNFGYVS I XNSYDMTF1QNFTGEQMASYFGYTW 
VS DVNSDG LDD VLVGAPL FMERE FESNP R E VGQ I YL YLQ VS S LL 
FRDPQILTGTETFGRFGSAMAHLGDLNQDGYNDIAI GVPFAGKD 
QRGKVLIYNGNKDGLNTKPFPKFCQGVWASHAVPSGFGFTLRGD 
SDIDK^YPDLrVGAFGTGKVAVYRARPVVTVDAQLLLHPMI IN 
LENKTCQVPDSMTSAACFSLRVCASVTGQS IANT IVLMAE VQLD 
S LKQKGAI KRTLFLDNHQAHRVFPLVIKRQKSHQCQDF I VYLRD 
ETEFRDKLS P INISLNYS LDESTFKEGLEVKP I LNYYRENIVS E 
QAHI LVDCGEDNLCVPDLKLSARPDKHOV I IGDENHLMLI INAR 
NEGEGAYEAELFVMI PEEADYVGIERNNKGFRPLSCEYKMENVT 
RMWCDLGNPMVSGTNYS LGLR FAV PRLBKTNMS INFDLQ I RS S 
N KDNPDSNFVS LQINITAVAQVE IRGVSHPPQ I VLP IHNWEPEE 














EPHXEEEVGPIiVEHIYELHNIGPSTISDTIIiEVGWPFSARDEFLi 
LY I FHIQTLGPIjQCQ PNPNINPQD I KPAAS PEDTPELSAFLiRNS 
TIPHLVRKRDVHVVEFHRQSPAiai^CTNIECLQISCAVGRLEG 
GESAVLKVRS RLWAHTFLQRKNDP YALASI»VS FEVKKMP YTDQP 
AKLPEGS IAI KTSV 1 WATPNVS FS I PLWVT I LAI I*DG1*LVIAI L» 
TLALWKCGFFDRARPPQEDMTDREQLiTNDKTPEA 


5959


1 


1166 


GTSGYAAQQLPSIiliKEREFHIjGTLNKVYASQHLNHRQVVCGTKC 
NTLFWD VQTS Q ITKI P I LKDREPGGVTQQGCG IHAI ELNPSRT 
I^TGGDNPNSI^YRLPT1^PVCVGI5DGHKDWIFSIAX?ISDTM 
AVSGSRIX5SMGLWEVTDDVLTKSDARHNVSRVPVYAHITHKALK 
DIPKSDTNPDNCKVRAI^FNNKNKELGAVSIJJGYFHLWKAENT^ 
SKLI>STKLPYCRENVCLAYGSEWSVYAVGSQAHVSFLDPRQPSY 
NVKSVCSRBRGSGIRSVSFYEHI ITVGTGQGSLLFYDIRAQRFIj 
EERliSACYGSKPRIJW?Em>KLTTG\KGWLNHDETWRNYFSDIDF 
FPNAVYTHCYDSSGTKLFVAGGPLPSGLHGNYAGLWS 


5960 


2853 


870 


fvwsdggprprrgpavgagaaklsdpwamtpgtanratnplnke 
ldwas ingfceqlnedfegpplatrllahki qs pqeweai qalt 
vletcmkscgkrphdevgkfrflnelikvvspkyi/5srtsekvk 
nki 1»el1*ys wtvglpee vkiaeayqmlkkqg \ ivksdpklpddt 

TFPLPPPRPKNVIFEDEEKSKMIiARLLKSSHPEDUW^NKIilKE 
FCVQEDQKKKEKXSKKVItA 

SSEDI>\MKEL\YQRCERmPTLFPTGRVDTEDlTO\EAIiAEILQA 
NDNIiTQVINIiYKQLVRGEEVNGDATAGS I PGSTS ALLDLSGI*DL 
PPAGTTYPAMPTRPGEQASPEQPSASVSLLDDELMSLGLSDPTP 
PSGPSIiDGTGWNSFQSSDATEPPAPAIAQAPSMESRFPAQTSLP 
ASSGLDDLDI^GKTLI^SLPPESQQVKWEKQQPTPRLTLRDLQ 
NKSSSCSSPSSSATSIiLHTVSPEPPRPPQQPVPTELSLAS IT VP 
LESIKPSNILPVTVYDQHGFRIIiFHFARDPLPGRSDVLWWSM 
LSTAPQP1RN1VTQSAVPKVMKVKLQPPSGTEI.PAFNPIVHPSA 
ITQVLLDANPQKEICVRIJIYICLTFTMGDQTYNEMGDVI^FPPPET 

WGSL 


5961 


198 


3147 


SGEPRPEPGNMATCIGEKIEDFKVGNLDGKGS FAGVYRAES IHT 
GLEVAI KMIDKKAMYKAGMVQRVQNEVKIHCQLKHPSILELYNY 
FEDSNYVYLVLEMC^GE^INRYLKNRVKPFSENEARHFMHQI IT 
GML Y LHS HG I LHRD LTLSNLLLTRN MN I KIAD FG LATQL KM PHE 
KHYTLCGTPNYI S PEIATRS AHGIjES DVWS LG CMFYTIiLI GRP P 
FDTDTVKNTI^nCVVI^YEMPTFLSIE 

SLSSVLDHPFMSRNSSTKSKDLGTVEDSIDSGHATISTAITASS 
STSI SGSLFDKRRLLIGQPLPNKMTVFPKNKS S TDFSSSGDGNS 
FYTQWGNQETSNSGRGRV I QDAEERPHSRYXRRAYSS DRSGTSN 
SQSQAiOTTMERCHSAEMLSVSKRSGGGENEERYSPTDNNANIF 
NFFKEKTSSSSGSFERPDNNQALSNHLCPGKTPFPFADPTPQTE 
TVG^WFGNLQIHAHLRKTTEYDS ISPNRBFQGHPPLQKDTSKNA 
WTryTKVKKN^DASDNAH^VKOONTMlCiri^ALHSKPEIIGOECV ' 
GSDPLSEQSKTRGMSPPWGYQNRTLRSITSPLVAHRLKPIRQKT 
KKAWS I LiDSEEVCVELVKE YASQE YVKE VLQ I S SDGNTI TI YY 
PNGG\RGFPIA\DRPPSPT\DNISR\YSF\DNLPEKYWRKYQYA 
SRFVQLVRSKS PKI TYFTRYAKCILMENS PGADFEVWFYDGVKI 
HKTEDFIQVIEKTGKSYTLKSESEVNSIiKEEIKMYMDHAPIEGHR 
ICLALES 1 1 SEEERKTRSAPFFPI I IGRKPGSTSSPKALSPPPS 
VDSNY PTRDRAS FNRMVMHSAAS PTQAP I LNPS M VTN3GLGI.TT 
TASGTD I S SNSL KDCLPKSAQLLKSVF VKNVGWATQ\ DTSGAVW 
VQFNDGSQLWQAGVSS ISYTSPNGQ\ TTR\ YGENEKbPDYI KQ 
KLQCLSS ILLMFSNPTPNFH 


5962 


20 


2447 


R VCSS S AS TASQAVMAD AWE E IRRLAAD FQ RAQ FAEATQRLS E R 
NCIEIVNKLIAQKQLEVVHTLIX5KEYITPAQISKETlRDEIirVRG 
GRVNI VDLQQVINVDLIHIENRIGDI I KSEKHVQLVU5QLIDEN 
YX,DRIAEEVNDKIiQESGQVTI SEIiCKTYDLPGNFIiTQALTQRLG 
HI I5GH I DLDNRGVI FTEAFVARHKAR I RGLFSAITRPTAVNS L 
ISKYGFQEQLLYSVLEELVNSGRLRGTVVGGRQDKAVFVPDIYS 
RTQSTWVDS FFRQNG YLEFDALS RLGI PDAVS Y I KKRYKTTQI*I» 








PLKAACVGQGIiVDQVEASVEEAI SSGTWVDIAPLLPTS I>S VEDA 
AI LLQOVMRAFS KOASTVVFSDTVVVSEKF\ INDCTEli F RELMH 
QKAEKEMKWPVHLITEEDLKQISTLESVSTSKKDKKDERRRKA 
TEGSGS MRGGGGGNARE YKI KKVKKKGRKDDDSDDESQS S HTGK 
KKPEISFMFQDEIEDFLRiOlIQDAPEEFISELAEYXIKPI^KTY 
LBWRSVFMS S TT SASGTGRKRT I KDLQEE VSNL YNN I RL F E KG 
MKFFADDTQAAI*TKHLLB^ VCTDITNIiIFNFIAS DLMMAVDDPA 
AITSE IRKKI LSKI*SEETKVALTKUJNSIjNEECS I EDFI S CLDSA 
AEACDIMVKRGDKKRERQ ILFQHRQAIiAEQLKVTEDPAL ILHLT 
SVLLFQ FSTHSMLHAPGRCVPQI IAFLNSKI P EDQHALLVKYQG 
LWKQLiVSQS KKTGQG DY PLNNELDK3QEDVAS TTRKELQELSS 
S I KDLVLKSRKSS VTEE 


5963 


62 


1130 



PWNPQDFPGNRGI^K3\QKGE IGP P \GQQGKKGAPGMP \GLMGSN 
GSPGQPGTPGSKGSKGEPG IQGMPGASGIiKGEPGATGS PGEPG Y 
MGLPGI QGKKGDKGNQGEKGIQGQKGENGRQGI PGQQG IQGHHG 
AKGERGBKGEPGVRGAIGS KGESGVDGLMG PAGPKGQPGDPGPQ 
GPPGLDGKPGREFSEQFIRQVCTDVIRAQIiPVLIiQSGRIRNCDH 
CLS QHG S PG I PG P PGP I G PEGPRGLPGLPGRDGVPGI* VGVP GRP 
GVRGLKGLPGRNGE KGSQG FGYPGEOGP PGPPGPEGPPG I S KEG 
PPGDPGLPGKDGDHGKPGIQGQPGPPG I CDPSLCFS VIARRDP F 
RKGPNY 


5964


3


2147 


SCKTRGRIjSPLQPREAGSSRGSRARSEPPRPGGMEEACQVQTTK 
RGDPHELRNI FLQ YASTE VDGERYMTPEDFVQRYLGI*YNDPNSN 
PKI VQliIiAGVADQTKDGL I S YQEFLAFES VLCAPDSMF I VAFQL 
FDKSGNGE VTFENVKEI FGQTI IHHHIPFNWDCEFIRDHFGHNR 
KKHLKYTEFTQFLQELQI»EHARQAFAliKDKSKSGMISGI*DFSDI 
MVTIRSHMIiTPFVEENLVSAAGGSISHQVSFSYFNAFNSLLNNM 
ELVRKIYSTLAGTRKDAEVTKEEFAQSAIRYGQATPIjEIDII^YQ 
LADLYNASGRIiTIiADIERIAPLAEGALPYISrL^LQRQQSPGLGR 
PIWLQIAESAYRFTLGSVAGAVGATAVYPIDLVKTRMQNQRGSG 
SVVGBIjWxTCNSF13CFlCKVI»RYEGFFGIiYRGLlPQLIGVAPEICAI 
KLTVKD FVRDKFTRRDGSVP LPAEVLAGGCAGGS QV I FTNPLE I 
VKIRLQVAGEITTGPRVSAIjNVLRDLGI FG LYKGAKACFLRD I P 
F5^IYFPVYAHCKIiLLADENGHVGGI^IJLJUVGAMAG\VPAASL,V 
TPADVI KTRLQVAARAGQTTYSGVIDCFRKIL\REEGPSAFWKG 

taarvfrsspqfgXvtlvtyellqrgfyidfgglkpagseptpk 

SRIADLPPANPDH IGGYRLATATFAGI ENKFGLYLPKFKSPSVA 
WQPKAAVAATQ 


5965 


1 


1498 


MVTWL YRFI»PTSNMAAK1iRSIj1»P PDLRLQFWLHARLQKCtXiSRG 
CGS YCAGAKAS PLPGKMAMGLMCGRRELliRLLQSGRRVHSVAGP 
SQWLG}0>LTTRIiLFPAAPCCCKPHYLFI*AASGPRSLSTSAIS FA 
EVQVQAPPWAATPS PTAVPEVASGETADWQTAAEQS FAELGL 
GSYTPVGLIQNIJ^FMHVDLGLPWWGAIAACTVFARCLI FPL IV 
TGQRFJWUlIIINHLPEIQKFSSRIREAKIxAGDHIEYYKASSEMAL 
YQXKHGIKLYKPLILPVTQAPIFISFFIALRBMANLPVPSLQTG 
GLWHPQD1»TVSDP I Y IL PLAVTATMWAVLELGAETG VQSS DLQW 
MRlWIRMMPI,ITLPITMHFPTAVFMYWI,SSNLFSIfVQVSCLRlP 
AVRTVLKI PQRWHDLDKLPPREGFLES FKKGWKNAEMTRQLRE 
REQRMRNQI*EI»AARGPLROTFTHNPIjLQPGKDNPPNIPSS\SSS 
SSKPKSKYPWHDTLG 


5966 


102 


1925 


RSKQVMARLTKRRQADTKAIQHLWAAIE I IRNQKQ iani dritk 

ymsrvhgmhpkettrqi>slavkdglivetltvgckgskag^ 
gywlpgdeidwetenhd^cfechlpgevli cdlcfrvyhs kci* 
spe frlrds s s pwqcpvcrs i kkkntnkqemgtylirfi vsrmke 

RAIDLNKKGKDNKHPWxRRI»VHSAVD^ 

PKADAQLLLHNTVI FYGADSEO^IARMLYIO>TCHEI»\DE1jQLC 
KNCF YLANARPDNWFCYPCI PNHELDWAKM KG FGFWP AKVMQKE 
DNQVD VRF FGHHHQRAW IPS ENI QD I TVN I HRLHVKRSMG W KKA 
CDELELHQRFliREGRFWKS KNEDRGEEEABSS I S STSNEQLKVT 
QEPRAKKGRRNQS VEPKKEEPEPETEAVSSSQEI PTMPQPI EKV 
SVSTOTKlCI^ASSPR>tLHRSTOrim^ FNDF 








KDRMKSDHKRSTERVVREALEKLRSEMEEEK^^ 1 
EMDRKCKQVKE KCKEEFVEEI KKLATQHKQL I SQTKKKQWCYNC 
EEEAMYHCCWNTSYCSIKCQQEHWHAEHKRTCRRKR


5967 


102


1925 


RS KQVWARLTKR RQADT KAIQHL WAAI E 1 1 RNQKQ IAN IDRITK 
YMSR VHGMHPKETTRQLSLAVKIX5LI VETLTVGCKGS KAG I EQE 
G YWLPGDE I DWETENH DWYCFECHLPGEVL I CDLCFR VYHS KCL 
SDEFRLRDSSSPWQCPVCP^IKKKNTNKOEMGTYTjRTIVSRM^ 1 
RAIDLNKKGICDNKHPMYRRI,VHSAVDVPTIQEiCVNEGKyRSYEE j 
FKADAQt>LI>HNTVT FYGADSEQADI ARMLYKDTCHEI»\DEIjQLC 
KNCFYLANARPDNWFCYPCIPNHELDWAKMKGFGFVIPAKVMQKE 
DNQ VDVRFFGHHHQRA W I P SEN I QD I TVN IHRLHVKRS MGWKKA 
CDELELHQRFLREGRFWKS KNEDRGEEEAESS I SSTSNEQLKVT j 
QEPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEI PTMPQP IEKV 
SVS TQTKKLSASS PRMLHRSTQTTNDGVCQS MCHDKYTKI FNDF 1 
KDRNKS DHKRETER VVREAL EKLRS EMEE EKRQA VNKAVANMQG 
EMDRKCKQVKEKCKEE FVEE I KKLATQHKQL ISQTKKKQWCYNC 
EEEAMYHCCWNTSYCSIKCQQEHWHAEHKRTCRRKR


5968 


81 


1288 


VRFPRRGGAPPTVIiTPGRQOGVFIiGPQRPGSEPDI PARGQPHPP 
RPVGVSTSAQAQVQPPA^O^RRRliAI/;l/3FCIJLACTSbSVIiWVYIl 
ENWLPVSYVPYYI^PCPEIFNMKXHYKREK^LQPVVWSQYPQPKIj 
LEHRPTQLLTLTPWLAPIVSEGTFNPEIiIjQHIYQPLOTjTIGVTV 
favgn/hflesaeeffkrgyrvhyyi ftdnpaavpgvpixgphrr* 
LSSI P I QGHSHWEETSMRRMBTISQHIAKRAEREVDYLFCLDVD 
MVFRNPWGPETLGDLVAAI H PS YYAVPRQQFP YER RR VSTAFVA 
DS EGD FYYGGAVFGGQ VARVYEFTRG CHMA I LAD KANG I MAAWR j 
ESSHI^RJnTCSNKPSKVJ^PEYLWDDRXPQPPSLKLIRFSTI*DK 
DISCLRS


5969 


1126 


533 


DVGFNI KRKRCDLDVFIiESPRKPSGRRDRAPEKQRRIAANKCXiC 1 
TGVREGEPPS /TTSQKVKEAGRDFTYLI WLFGI S I TGGLFYT I 
FKELFSSSS PSK1 YGRALEKCRSHPEVZGVFGESVKG YGEVTRR 
GRRQHVRFTEYVKDGLKHTCVKFYTEGSE pgkqgt vyaqvkenp j 
GSGEYDFRYIFVEIESYPRRTIIIEDNRSQDD


5970 


316


4712 



SQDNIGHRIiLQKHGWKliGQGLGKSLOGRTDPIPIVVKYDVMGMG j 
RMEMEtiDYAEDATERJUlVLEVEKEDTEELRQKYKDYVDKEKAIA 
KALEDLRANF YCELCDKQ YQKHQE FDNH I NS YDHAHKQRLtKDI>K I 
QREFARNVSSRSRKDEKKQEKALRRLHEliAEQRKQAECAPGSGP 
MFKPTTVAVDEEGGEDDKDESATNSGTGATAS CGLGS E FS TDKG 
GPFTAVQI TNTTGLAQAPGLAS QG I S FG I KNNLGTPLQKLGVS F 
SFAKKAPVKLESIASVFKDHAEEGTSEDGTKPDEKSSDQGLQKV 
GDSDGSSNLDGKKEDEDPQDGGSLASTI*S KL-KRMKREEGAGATE I 
PEYYHYIPPAHCKVKPNFPFIXFMRASEQMDGDNTTHPKNAPES 
KKGSSPKPKSCIKAAASQGAEKTVSEVSEQPKETSMTEPSSPGS | 
KAEAKKAI^GDVSDQSLESHSQKVSETQMCESNSSKETSIATPA 
GKESQEGPKHPTGPFFPVLSKDESTALQWPSELLIFTKAEPSIS 
YS CNPLYFDFKLSRNKDARTKGTEKPKDIGSSSKDHIjOGLDPGE 
PNKSKEVGGEKIVRSSGGRMDAPASGSACSGIiNKQEPGGSHGSE 
TEDTGRSLPS KKERSGKSHRHKXKKKHKKSS KHKRKHKADTEEK 
SSKAESGEKSKKRKKRKRKKNKSSAPADSERGPKPEPPGSGSPA 
PPRRRRRAQDDSQRRSLPAEEGS SGKKDEGGGGSS SQDHGGRKH 
KGELPPSSCQRRAGTKRSSRSSHRSQPSSGDEDSDDASSHRiHQ 
KSPSQYS EEE EEEDSGSEHSRSRSRSGRRHSSHRS SRRS YSS SS 
DAS SDQS CYS RQRS YSDDS YSD YSDRSRRHS KRSHDSDDS D YAS 
S KHRSKRHKYSSSDDD YSLSCSQSRSRS RSHTRERSRSRGRSRS 
SSCSRSRSKRRSRSTTAHSWQRSRSYSRDRSRSTRSPSQRSGSR 
KRSWGHESPEERHSGRRDFIRSKIYRSQSPHYFRSGRGEGPGKK 
DDGRGDDSKATGPPSQNSNIGTGRGSEGDCSPBDKNSVTAKLIiD 
EKIQSRKVERKPSVS EEVQATPNKAGPKLKDPPQGYFGPKLPPS 
LGNKPVLPLXGKLPATRKPNKKCEESGliERGEBQEQS ETEEG PP 
GSSDALFGHQFPXSEETTGPLLDPPPEESKSGEVTADHPVAPLG 
PPAHFDCYTjGDPTISHNYI»PDPSDGNTLESI*DSSSOPGPVESSIj 
LPIAPDLEHFPSYAPPSGDPS I ESTDGAEDA\ SLAPLESQ P I TF 








TPEEMEKYSKLQQAAQQH2 QQQIiIiAKQVKAFPAS AALAPATPAI* 
QPIHIQQPATASATS ITTVQHAILQHHAAAAAAAIGI H PHPHPQ 
PLAQVHHI PQPHLTPI SLSHLTTHSI I PGHPATFLASHPIHI IPA 
SAIHPGPFTFHPVPHAALY PTIjLAPRPAAAAATALHLHPLLHP I 
FSGQDLQHP PSHGT 


5971 


53 


2149 


SFL»Y FVGVDMDNP IGNWDGRFDGVQLCS FACVESTI LIiHINDI I 
PESVTQERRPPKLAFMSRGVGDKGSSSHNKPKATGSTSDPGNRN 
RSSLFYTLNGSSVDSQPQSKSKNTWY IDEVAEDPAKSLTE I STD 
FDRSSPPLQPPPVNSIiTTENRFHSIjPFSLTKMPNTNGSIGHSPL 
SLSAQSVMEEI*NTAP VQES P PIiAM PPGNSHGLEVGSLAE VKENP 
PFYGVIRWIGQPPGLNEVIAGI^ELEDECAG\CTDGTF/REGTRY 
FTCAIJCKALFVKIJCSCRPDSRFASIiQPVSNQIERCNSliAIWEAY 
LSEVVEENTPTQKWEKEGLEIMIG\KKKGIQGHYNSCYUDSTLF 
CLFAFSSVLDTVLLRPKEKNDVEYYSETQEIiLRTEl VNPLR I YG 
WCATKIMKLRKILEKVBAASGFTSEEKDPEEFIjNILFHHIIiRV 
BPLIiKXRSAGQKVQDCYFYQI FMEKNEKVGVPTIQQLLEWS FIN 
SNUCFAEAPSCLI IQMPRFGKDFKLFKKI FPSLELN ITDLLEDT 
PRQCRICGGLkAMYECRECYDDPDISAGKI KQFCKTCNTQVHL.HP 
KRLNHKYN P VS LPKDLPDWDWRHGCI PCQNMEL FAVLCI ETSHY 
VAFVKYGKDDSAWLFFDSMADRBGGQNGFNIPQVTPCPEVGEYIi 
KMSLED^SLDSRRIG^CARRLLCnAIYVPCTQSPTMSLYK 


5972 


440 


1761 


ILLAGSPSPRDQCSQRQSSGGDKEIiVTRGCTFSTAWSPSAMTQ 
EPFREEDAYDRMPTLERGRQDPASYAPDAKPSDLQLSKRLPPCF 
SHKTWVFSVI»MGSCnjI*VTSGFSIiYljGNVFPAEMDYLRCAAGSCI 
PSAIVS FTVSRRNANVI PNFQILFVSTFAVTTTCLIWFGCKLVI* 
NPSAININFNLI LLLLJjELLMAATVI IAARSSEEDCKKKKGSMS 
DSANILDEVPFPARVI*KSYS WEVIAGISAVLGGI IALNVDDSV 
SGPH1>SVTFFWILVACFPS AIASHVAAECPNKCLVEVL IAI SSL 
TSPI»I*FTASG YLS FS I MRI VEMFKDY ? PAI KPS YDVLI*I>I>I*LLV 
IiLIiQA/GPQHGHRHPVRALQGQCKAAGCIIiGHPERPAGAPGWGG 
GQEPPEGVRQGESLESRRGANGPVTPRRG^mVAAPSLAPGMETH 
NP 


5973 


65 


2007


NGDGKDLFGHI WAWRSNGI ISNFRRS PHAGMAEDEPDAKS PKTG 
GRAP PGGAEAGE PTTLIjQRLRGT I SKAVQNKVEG I LQDVQKFS D 
NDKLYI>YXQLPSGPTTGDKS SEPSTLSNEEYMYAYRWIRNHIiEE 
HTDTCLPKQSVYDAYRKYCESLiACCRPLSTAN FGKI IRE IFPDI 
KARRLGGRGQSKYCYSGIRRKTLVSMPPDFGLDLKGSESPEMGP 
EVTPAPRDELVEAACAIjTCDWAERI LKRS FSS I VEVARFLLQQH 
LI SARS AHAHVLKAMGLAE EDEHAP RERS S KPKNGL. EN PEGG AH 
KKPERIiAQPPKDLEARTGAG PLARGERKKS VVES SAPGANNI*QV 
NALVARLPLIiLPRAPRSLI PPI PVSPPIIiAPRLSSGALKVATLP 
LSSRAGAPPAAVP 1 INMILPTVPALPGPGPGPGRAPPGGLTQPR 
GTENREVG IGGDQGPHDKGVKRTAEVPVS E ASGQAP PAKAAKQD 
IEDTASDAKRKRGRPLKKSGGSGERNSTPLKSAAAMESAQSSRIj 
PWETWGSGGEGNSAGGAERPGPMGEAEKGAVIiAQG\ QGDGTVSK 
GGRGPGSQHTKEAEDKIPLVPSKVSVIKGSRSQKEAFPIiAKGBV 
DTAPQGNKDLKEHVLQS SIiS QEH KDPKATP P


5974 


4293 


2200 


LGLQMHTTSGR IHQAM VTSLNEDNESVTVE WI ENGDTKGK\E ID 
LES I FSU6TP \ DI>\ VPDGEIEPSP \ETP P PPASS AKVNKIVKNRR 
TV \ AS IKNDPPS \ RDNRVVGS ARARPSQFPEQFSS AQQNGSV\S 
D I S P VQAAKKE FG PPSRi2KSNCVKEVEKIrQEKREKRRLQQQEIiR 
EKRAQDVDATNPNYE I MCMIRDFRGSLDYRPLTTADP IDEHRI C 
VCVRKRPIiNKKETQMKDLDVITI PSKD WMVHEPKQKVDLTRYL 
ENQT FRFD YAFDDS APNEMV Y R FTARPIj VET I FERGMATCFAYG 
QTGSGKTHTMGGDFSGKNQDCS KG I YAIAARDVFTiMLKKPNYKK 
LEI^VYATFFEIYSGKVFDIXNRKTKIJ?VI*EDGKQQVQVVGLOE 
REVKCVEDVLKLiIDIGNSCRTSGQTSANAHSSRSHAVFQI ILRR 
KGKLHGKFSLIDLAGI^ERGADTSSADRQTRLEGAEIWKSLLiALK 
EGIRAI/aOTCPHTPFRASKIiTQVIiRDSFIGENSRTCMIATISPG 
MASCENTLNTI>RYANRVTCEIjTVDPTAAGDVRPIMHHP 
LBTQWGV1GSS PQRDDLKLI^CEQNEEE^ \ 








eeqwedrravfqes i rwledekaixlemteevdyd vds yatqle 
ai leqkidilteLrdkvksfraalqeeeqaskqinpkrprai, 


5975


4293 


2200 



LGIjQMHTTSGR ihqamvts lnednes vtvewi ENGDTKGK \ E I D 
LES1FS1jNP\DL\VPIX^IEPSP\ETPPPPASSAKVNKIVKNRR 
TV\ASI KNDPPS \ PJDNRWGSARARPS Q F PEQFS S AQQNGSV\ S 
D I S P VQ AAKKEFG P P S RRKSNCVKEVEKLQ E KRE KRRIjQQQEIiR 
EKRAQDVDATNPNYEIMCMIRD FRGSIiDYRPLTTADP IDEHRI C 

ENQTFRFD YA FDDSAPNEMVYR FTARPItVETI FERGMATCEAYG 

QTG 5G KTHTMGGD FS G KNQDCS KG I Y ALAARD VFLMli KKPNYKK 

LELQVYATFFE IYSGKVFni.T»NRKTKLRVLEDGKQQVQVVG%QE 

REVKC^EDVLKLIDIGNSCRTSGQTSANAHSSRSKAVFQIILRR 

KGKUIGKFSLIDLAGNERGADTSSADRQTRIxEG^ 

ECI RAI/3RNKPHTPPRASKLTQVLRDSFIGENSRTCMI ATIS PG 

MAS CBNTIJm^YAOTiVKELTVDPTAAGDVRP IMHHPPNQI \DD 

LETQWGVGSS PQRDDLKIJbCEQNEEEVS PQIxFTFHEAVS QMVEM 

EEQWEDHRAVFQES IRWDEDEKAT J rEMTEEVD YD VDS YATQLE 

AIIxEQKIDlLTEIxRDKVKSFRAALQEEEQASKQINPKRPRAL 


5976 


20 


2949



VHHLHLTRVSVWNLDI ILR1AQQMGIKTLNLVLG\LKRA\LBF 
PEVSWMEVKDPNMKGAI4LTNTGKYAI PTIDA\EAYAIGKKEKPP 
FLPEEPS SS SEEDDPI PDELLCLICKDIMTDAVVTPCCGNSYCD 
BCIRTALLESDEHTCPTCHQNDVS PDAL I AiTKFLRQAVNNFKNE 
TCYTKPJ^KQLPSPPPPIPPPRPLIQRNLQPLMRSPISRQQDPL 
MI PVTSS S THPAPS I S S LTSNQS S IiAPPVSGNPS SAPAPVPDI T 
ATVSISVHSEKSDGPFRDSDNKILPAAALASEHSKGTSSIAITA 
LMEEKG YQ VP VLGTPS LLGQSLLHGQXjX PTTGP VRINTARPGGG 
RPGWEHSNKLGYLVSP PQQIRRGERSCYRS INRGRHHSERSQRT 
CGPSIjPATPVFVPVPPPPIiYPPPPHTLPLPPGVPPPQFSPQFPP 

GQP \ ppagysvpppgfppapanlstpwvssgvqtahsntxpttq 

APPLSREEFYREQPJUiKEEEKKKSKIiDEFTNDFAKEXMEYKKIQ 

RSHSRS YSRS P P YPRRGRG KSRNYRSRSRSHG YHRS RS RSPPYR 
RYHSRS RS PQAFRGQS PNKRNVPOGETEREYFKRYREV P P P YDM 
KAYYGRSVDFRDPFEKERYREWERKYREWYBKYYKGYAAGAQPR 
PSANRENFS PERFT.PLNIRNS PFTRGRREDYVGGQSHRSRNIGS 
NYPEKLSARDGHNQKDNTKSKEKESENAPGI^KGNKHKKHRKRR 
KGEES EGFXNPELLETSPJCSREPTGVESNKTDSLFVIxPSRDDAT 
PVRDEPMDAES ITFKSVSEKDKRERDKPKAKGDKTKRKNDGSAV 
cxrtrPMT^TVPTi vrz pr>i?TT\mn V nxrpnT .T .rvr .NT ,\ OTiKKPKRHTPKDL 
TnamHLPIxRRMIO*SL\EPP\EKI/ra^ 
EGI»FQRCQ IRKANN 


5977 


1363 


1336 


FI^DRGQVLSHFQCLSLHSINHILHPGAGVAAGPATGW/REYLT 
PVLKES KFKETGVI TPEEPVAAGDHLVHHCPTWQWA TCEJELKVK 
AYLPTGKQFLVTKNVPCYKRCKQMEYSDELEAI IEEDDGDGGWV 
DTYHNTGITGITEAVKEITLENKDNIRLQDCSAIiCEEEEDEDEG 
EAADM E E YEE SGLLETD EATLDTRKI VEACKAKTDAGG EDA I I>Q 
TRTYDI*Y ITYDKYYQTPRLW LFGYDEQRQ PLTVEHMYEDIS QDH 
VKKTVTI ENHPHIiPPPPMCSVHPCRHAEVMKKI I ETVAEGGGE1* 
GVHMYLLI FLKFVQAVIPTI E YDYTRHFTM 


5978 


160 


3213 


RDGARRWGGCQSPLTWAPG F YRRFDLATSGRRLRGQTAEPAGRQ 
RPRREPEAMDEQSVES IAEVFRCFICMEKLRDARiCPHCSKLCC 
FS C1RRW I/TEQ RAQ CPHCRAP LQLR EI*VN CRWAE E VTQ QIxD TI/Q 
LCSLTKHEEOTKDKCENHHEKLSVFCWTCKKCICHQCAIjWGGMK 
GGHTFKPLAE I YEQHVTKVNE EVAKLRRRLME L ISIjVQE VERNV 
EAVRNAKDERVREI RNAVEMM IARIiDTQLKNKIjI TLMGQKTSLT 
QETEIxLESLU)EVEHQIJ?S(^KSELISKSSEILMMFQQVHRKPM 
ASFVx^PVPPDFTSELVPSYDSATFVLEOTSxT^RQRADPVYSPP 
I>QVS GLCWR LKVY P DGNGWRG YYLS VFI* ELS AGI* PETS KYEYR 
VEMVHQS CNDPTKNI IREFASDFEVGECWGYNRFFRIiDIxIANEG 
YI^P^NDTVIIxRFQVRSPTFFQKSRI>2HWYITQIaEAAQTSYIC^ 
INNLKERliT I ELSRTQKSRDI*S P PDNHIxS PQNDDALETRAKXS A 




CSDI4LLER\GPYSAS\VREAKEDEEDEEKIQNEDYHHEbSDGDIi 

L/JjJL/Jj V X C>Lf£i V WyUAjoOOO/W C> 1A 1 or* 1 C» CtX* i-r X U C C» X I lOVj EtViLJ V 

EYIJNMELEEGEI^DAAAAGPAGSSHGYVGSSSRISRRTHLCSA 

ATSSLUDIDPLILIHLLDLKDRSS I EKLWGLQPRPPASliLQPTA 

SYSRKDKDQRKCWWWRVPSDI.KMIiKRliKTQ&^ 

TLSEI KSSSAASGDMQTSLFSADQAALAACGTENSGRXQDLGME 

IJJUCSSVANCYlRNSTNiCKSNSPKPAP^SVAGSI^LRRAVDPGE 

NS RS KGDCQTLSEGS PGS SQSGSRHSS PRAL IHGS I GD I LPKTE 

DR QCKALDS DA VWAVF SG LPAVE KRRKMVT LG AN AKGG HL.EGL 

QMTDLENNS ETGELQP VLPEGAS AAP EEGMS SDS DI ECDTENEE 

QEEHTS VGG FHDS FWVMTQPPDEDTHSSFPDGEQIGPEDLS FNT 

DBNSGR 


5979 


212 


3665 


liPDmWYJUWLKl.I^GFAFLDTEVFVTGQSPTPSPTDAyLNASE 
TTTLS P SGS AVI S TTT IATTPS KPTCDEKYANI TVD Yli YNKETK 
LFTAKLNVNENVECKNNTCTNNEVHNIjTECKNASVS ishnscta 
PDKTLIU3VPPGVEKVPVHCCS\QVEQPDSTIWLKWKNIETSTC 
DTQN ITYRFQCGNMI PDNKEI KLENLEPEHEYKCDSEILYNSHK 
FTNASKI IKTDFGSPGEPQI I FCRSEAAHQGVI TWNPPQRSFHN 
FTLCYI KETEKDCIiNI*DKNLI KYDLQNLKP YTKYVDSLHAY 1 1 A 
KVQRNGSAAMCHFTTKSAPPSQWNMTVSMTSDNSMHVKCRPPR 
DRNGPHERYHLEVEAGim^VRNESHKNCDFRVKDLQYSTDYTFK 
AYFHNGDYPGEP FXLHHSTS YNSKAL IAFLAFLI I VTS IALLW 
LYKI YDI^KRSCNLDEQQELVERDDEKQLMNVEPIHADI LLET 

ykrkiadegriiflaefqs i prvfskfp i kearkp fnqnknryvd 
h.pydynrvelseingdagsnyinasyidgfkeprkyiaaqgpr 
det vddfwrm i weqkatvi vmvtrceegnrnkcaeywpsmeegt 
rafgeccciojltkhkrcpXdyiiqklnivnkkekatgrevthiq 
ftswpdhgvpedphiillklrrrvnafsnffsgpivvhcsagvgr 

TGTYIG IDAMLEGLEAENKVDVYGYVVKLRRQRCIWVQVEAQY I 
1j1HQA1*VEYNQFGETEVNI^EI»HPYLHNMK3CRDPPSEPSPLEAE 
FQRLPS YRSWRTQHI GNQE \ ENKSKNRNSNVI P YD YNRVPI*KHE 
LEMSKESEHDSDESSDDDSDSEEPSKYIKASFIMSYWKP\EVMI 
AAOGPLKETIGDFWQMI FQRKVKVTVHLTELKHGDQE ICAQYWG 
EGKO/TYGDIEVDIjKDTDKSSTYTLRVFELRHSKRKDSRTVYQYQ 
YTNWSVEQI^AEPKELISMIQVVKOKLPQKHSSEGNKHHKSTPL 
IiIHCRDGSQQTGI FCALLNLLESAETEEWDIFQWKALRKARP 
GMVSTFEQYQFLYDVIASTYPAQNGQVKKNNHQEDKIEFDNEVD 
KVKQDANCVNPLGAPEKLPEAKEQAEGSEPTSGTEG PEHSVNG P 
ASPALNQGS 


5980 


3 


2363 



DAWGCKLRRI^FTYGTQTRVSLALPGQyELVHTriVAHQGNWETI 
PEEDLEVQENKEDAAHDLTELEVTMHHALIjQEVDVVVAPCQGLR 
PT\fl3VLGDLVNDFIiP VITYALHXDELS ERDEQELQE I RKYFS FP 
VFFFKVPWLGSEIIDSSTRRMESERSPLYRQIjIDLGYI.SSSHWN 
CGAPGQDTKAQS MLVEQSE KLRHLSTFSHQVLQTRLVDAAKALN 
LVHCHCLDI FINQAFDMQRDLQITPKRLEYTRKKENELYESIMN 
IANRKQEEMKDM rVETLNTMKEELLDDATNMEFKDV I VPENGE P 

T TfTT} C T ty/T'TDO T /~\TT T T CDT KTORXJAMVT T C CUT1VT.PPC TTTTT^TT 
VVjXKil J.JP»L.L-iKy J. UhblldKunvnv/WlrujXoovuZJUJUior vo XX* 

ERCLQSI^KSQDVSVHITSNYLKQILNAAYHVEVTFHSGSSVTR 
MLWEQIKQI IQRITWVSPPAI TLE WKRKVAQEAI ES I>S AS KLAK 
SICSQFRTRIiNSSHEAFAASLRQLEAGHSGRLEKTEDLWLRVRK 
OHAPRIiARI^LESRSIjODVIjIjHRKPKIjGOEIjG^GOYGVVYLCDN 
WGGHFP CALKS WP PDE KHWNDLALEFH YMRSLPKHERI.VDLKG 
SVIDYNYGGGSSIAVLLIMERLHRDLYTGLKAGLTLETRLQIAL 
DVVEG IRFIJISOX3I»VHRDI KLKNVULrDKQNRAKI TDIK3FCKPEA 
MMSGS IVGTP iHMAPEIiFTGKYDNS VDVYAFGI 1*FW Y I CSGSVK 
LPEAFERCASKDHLVmNVRRGARPERLPVFDEECWQLMEACWI^ 
DPLKRPI»LGIVQPMIjQGIMNRI*C1CS\NSEQPNRGIjI)DST 


5981 


1


2519 


gr^saamejipwgaatkslsrwphglglllliqllppstlsqdrl 

dappppaaplprwsgpigvswglraaaa\ggafprggrwrrsap 
g Xedeecxsrvrdfvaxij^nthqhvfddlrgsvsi^ wgdstgv 
ilvl*ttfhvpijvimtfgqski.yr^edygknfkditdiiinntfik 








TE FGMAI GPBKSGXWLTAE VSGGSRGGR I FRSSDFAKNFVQTD 
LPFHPLTQMMYSPQNSDYIiIiALSTENGbHVSKNFGGKMEEIHKA 
VCLAKWGSDNTI FFTTYANGSCKADLGALELWRTSDLGKSFKTI [ 
GVKIYSFGIXK3RFLFASVMAI3KDTTRRIHVSTDQGDTWS 
SVGQEQFYS ILAANDI^WFMHVDSPGDTGFGTI FTSDDRG I VYS 
KSLDRHLYTTTGGETDFTNVTSLiRGVYI TSVLSEDNSIQTMITF 1 
DQGGRWTHXjRKPENSECDATAKNKNECSIiHIHASYSISQKLNVP I 
MAPLSEPNAVG I VI AHGSVGDAI S VMVPDVYTSDDGGYSWTKML
EG PHYYT 1 LDSGG I X VAIEHSS RP INVI KFSTDEGQCWQ TYTFT 
RDPI YFTGLASEPGARSMNIS IWGFTBS FT.TSQWVS YTIDFKDI j 
LERNCEEKDYTIWIiAHSTDPEDYEDGCILGYKEQFLRLRKSSVC 
QNGRDYWTKQPS I CL CS LED FLCDFGYYRP ENDS KCVEQ PEL K 
GHDLEFCLYGREEHLTTNGYRKI PGDKCQGGVNPVREVKDLKKK 
CTSNFLSPEKQ^SKSNSVPIIIAIVGLMLVTVVAGVLIVKKYVC 

GGRFLVHLYSVMQH\AEA\NGVDGVEALDTASHTNKSGYHDDS 
DEDLLE


5982 


56 


2316


ATR?PRGSSWC3^QFSRTASAAPGRSNMLRIPVRKALVGLSKSPK j 
GCVRTTATAAS NL I EVFVDGQ S VMVE PGTT VLQACE KVGMQ I PR 1 

FCYHERLS VAGNCRMCLVEIEKAP KWAACAMPVMKGWNI LTNS 
1 EKSKKAREGVMEFLLANHPLDCPICDQGGECDLQDQSMMFGNDR j 
SRFTiEGKRAVEDKNIGPLVKTIMTRCIQCTRCIRFASEIAGVDD 
LGTTGRGNDMQVGTYIEKMFMSELSGNI 1DICPVGALTSKPYAF 
TARPWETRKTESIDVMDAVGSNIWSTRTGEVMRILPRMHEDIN 

AGMLQ S FQG KD VAA I AGG LVD AE AL VAL KD LLNRVDS DTLCTE E 
VFPTAGAGTDLRSNYLI^TCIAGVEEADWLLVGTNPRt'EAPLF 1 
NARIRKSWLHNDLKVALIGSPVDLTYTYDHLGDSPKILQDIASG 
SHP FSQ VLKEAXKPMVVLGSSALQRND3AAI LAAVSS IAQKIRM 
TSGVTGDWKVMNII^RIASQVAAXJ)LGYKPGVEAIRKNPPKVLF 
LLGADGGCITRQDLPKDCFIIYQGHHGDVGAPIADVILPGAAYT 
EKSATYVNTEGRAQQTKVAVTPPGLAREDWKI I RALSEIAGMTL 
P YDTL \ DQ VRNR LEEVS PNLVRYDDI EG \ ANYFQQANELS KLVN 
QQLXiAD PLVP PQLTWKDFYMTDS I SRASQTMAKCVKAVTEGAQA 
VEEPSIC 


5983 


248 


1763 


EARGDGGRRRHRASGRRAGRGEP \AGLKSQGQRAVPKRAVARGG 1 
RQ\ YS AAIALLE P AGSE IADDLS I LYS NRAACYLKEGNCSG C I O 
DCNRAI^I^PFSMKPLLRRAKAYETLEQYGKAYVDYKTVLQIDC 
GLQ LANDS VNRLS RILMELDG PNWRE KLSL I PAVPASVPLQAWH 1 
PAKEM I S KQAG D S S SHRQQG I TDEKTFXALKEEGNQCVNDKNYK j 
DALSKYSECLKINNKECAIYTim^CYLKLCQFEEAKODCDQAL 1 
QLADGNVKAF YRRAIJVHKGLKN YQ KS L I DLN KVI LLD P S 1 1 E AK 
MELEEVTRIiLNLKDKTAPFNKEKERRKIEIQEVNEGKEEPGRPA J 
GEVSTGCLASEKGGKSSRSPEDPEKLPIAKPNNAYEFGQIINAL 
STRKDKEACAHLIAITAPKDLPMFLSNKLEGDTFIjLLIQSLKNN 
LIEKDPSLVYQHIiLYIjSKAERFKMMLTIjI skgokeli eqlfedl 

sdtpiwhftlediqalkrqyel 1 


5984 


755


1193 


ssvcmactyvsnlgkxqrsvsflasgli^vstgpelrlhhsfvl 
tgdvgrri crllvglftkgdtss krvhpfs pgpcfllcdlar vg
sspkinvs pfyqn\qtstqrs ctvfvwqrcslvgpfqvtvftmy 

FHHSLRS I S RFSSG


5985 


22 


1408 


RR VARPGTAE P AKARRTVRRGRARRDLAGAERKAGVS ERQDS GR
RRPNPS I PSAAAGMSHIQI PPGLTELLQG YrVEVLRQQP PDLVE 
FAVEYFTRLREARAPAS VLPAATPRQS LGHP PPEPG PDRVADAK 
GDS ES EEDEDLE VPVPSRFNRRVSVCAETY1TPDEEEEDTDPRVI 
HPKroEQRCRLQEACKDILLFKNLDQEQLSQVXiDAMFERIVKAD 
EHVI DOGDDGDNF YVI E RGTYD I LVTKDNQTRSVGQ YDNRGS FG 1 
ELALMYNTPRAATI VATSEGSLWGLX)RVTFRRI I VKNNAKKRKM
FESFI ES VPLLKSLEVSERMKIVDVIGBKI YKR/DGERI ITQGE 
K\ADSFYI I ESGEVS ILIRSRTKSNKDGGNQEVEIARCHKGQYF j 
GELALVTNKPRAA£AYAVGDVKCLVMDVQAFERLTX?PCMDir^KR 
NISHYEEQLVKMFGSSVDLGNLGQ J 
5986


1606


484



BAWKSTS LTPHWKLWGRHRGRRRGLAHPKNHLS PQQGGATPQV P 
SPCCRFDSPRGPPPPRLGIJJGALMAEDGVRGS PPVPSGPPMEED 
G1iRWTPKSPIxDPDSGLI>SCTLPNGFGGQSGPEGERSIjAPPDAS I 
LISNVCS IGDHVAQEI,FGGSDLGMAEEAERPGEK\AGQHSPLRE 
EHVTCVQSILDEFLQT\YGSLI PLSTDEVVEKLEDI PQQEFSTP 
S RKGL VLQ 1. 1 Q S YQRMPGNAMVRG FR VA YKRHVLTMDD LGTL YG 
QNWLNDQVMNM YGDL VMDTVP E K\ VHF FNS FFY \DKLRTKG YDG 
VKRHTKim)IFNKEI^IPIHI*EVHWSI,ISVDVRRRTITYF 
RTLNRRCPKH I AKYLQAEAVKKDRLDFHG^WKGYFKMNVARQNN 
DSDCGAFVLQYCKHLALSQPFS FTQQDM P KLRRQIYKELCHCKL 
TV 



5987 



1806 



484 



DAMKSTSLTFHWKLWGRHRGRRRGLAHP KNHLS PQQGGATPQVP 
SPCCRFT)SPRGPPPPRl^LliGALMAEDGVRGSPPVPSGPPMEED 
GLRWTPKSPLDPDSGIiLSCTLPNGFGGQSGPEGERSLAPPDASI 

li suves igdhvaqelfqgsdixsmaeeaerpgekYagqhsplre 

EHVTCVQS ILDEFLQT\ YGSLI PLSTDEWEKLEDI FQQE FSTP 
SRKG LVLQL I QS YQRM PGNAMVRGFR VA Y KRHVLTMDDLGTLYG 
CNWLNDQVMNOTGDLVMDT^ 

VKRV7TKNVDI FNKELLLIPIHLEVHWSLISVDVRRRTITYFDSQ 
RTLNRRCPKH IAXYLQAEAVKKDRLD FHQGW KG YFKMNVARQNN 
DSDCGAFVLQYCKHIJ^SQPFSFTQQD MP KLRRQIYKELCHCKL 
TV 



5988 



1292 



410 



FKKYFLS FLGLLE S SHSRDR I HNLVLMFLiLATHNLVWW FTCRFQ 
RLDCI YLNAG I MPNPQLNI KALLFGLFS \ AEGLLTQGDKI TADG 
LQEVFETDVFGHF I LIRELEP LLCHSDN PSQL I WTS SRNARKSN 
FSLEDFQHS KGKE P YSS S KTYATT)LLS VALNRNFNQQGL YSNVAC 
PGTALTNLTYGILPPFIWTLIMPAIia^RFFANAFTLTPYNGTE 
ALVWLFHQKPBS LNPL I KYLS ATTGFGRNY IMTQKMDLDE0TAE 
KFYQKLLELEKHIRVTIQKTDNQARLSGSCL 



5989 



194 



2610 



AMDFPQHSQHVLEQLNO^RQLGLLCDCTFVVDGVHFKAHKAVLA 
ACSEYFKMLFVDQKDVVHLDI SNAAGLGQVLEFMYTAKLSLSPE 
NVDDVL\AVATFLQMQDI ITACHALKSLAEPATSPGGNAEALAT 
EGGDKRAfCEEKVATSTLSRLEQAGRSTP I G PSRDL KE ERGG QAQ 
SAASGAEQTEKADAPREPPPVELKPDPTSGMAAAEAEAALSESS 
EQEMEVEPARKGEEEQKEQEEQEEEGAGPAEVKEEGSQLENGEA 
PEENENEESAGTDSGQELGSEARGLRSGTYGDRTESKAYGSVIH 
KCEDCG KE FTHTGNFKRH I RIHTGEKP FS CRECSKAFSDPAACK 
AHEKTESPLKPYGCEEaSKSYRLISLLOTjRKKRHSGEARYRCED 
CGKLFTTSGNLKRHQLVHSGEKPYQCDYCGRSFSDPTSKMRHLE 
THDTD KEHKC PHCD KKFNQ VGN L KAHL KI H IADG P LKCRECG KQ 
FTTSGNLKKHI^IHSGEKPYVCIHC^RQFADPGAI^RHVRIHTG 
EKPOQCVMCGKAFTQAS SLIAHVRQHTGEKP YVCERCGKRFVQS 
SQIANHIRHHDNIRPHKCSVCSKAFVNVGDLSKHII IHTGEKPY 
LCDKCGRGFNRVDNLRSIIVKTVHQGKAG I KI LEPEEGSEVS WT 
VDDMVTI^TEALAATAVT13LTVVP VGAAVTADETE VlaKAEI S KA 
VKQVQEED PNTHILYACDSCGDKFLDANS LAQHVRIHTAQALVM 
FQTDAD F YQQ YGPGGT W P AGO VLQ AG EL V FR P RDG AEG Q P ALAE 
TSPTAPECPPPAE 



5990 



2 



4700 



FGPGPDSGGGARGSGVJGSRSQAPYGTLGAVSGGEQVLLHEEAGD 
SGFVSI^RIjG PSLRDKDLEMEELMI^DETLLGTMQS YMDASL I S 
LIEDFGSLGEVEMSLPDPSWDFSPPSFLETSSPKLPSWRPPRSR 
PRWGQ5PPPQQRSDGEEEEEVASFSGQILAGELDNCVSS I PDFP 
MHLACPEEEDKATAAEMAVPAAGDES ISSLSELVRAMHPYCLPN 
LTHLASLEDELQEQPDDLTLPEGCVVLEIVGQAATAGDDLEIPV 
WRQVSPGPR PVLLDDSLETSSALQLLMPTLESETEAAVPKVTIi 
CS EKEGLS LNS EEKIJDS ACXLKP RE^n/EPVVPKE PQNPPANAAP 
GSQRARKGRKKKSKEQPAACVEGYARRLRSSSRGQSTVGTEVTS 
QVDNl^KQPQEEXiQKESGPLQGKGKPRAWARAWAAALENSSPKN 
LERS AGQS5 PAKEGPLDLY PKLADTI QTNPI PTHLSLVDS AQAS 
P«PVDSVEADPTAVGPVIAGPVPVDPGLVI)LASTSSEI»VEPLPA 
EPVLINPVIiADSAAVDPAVVPISDNLPPVIlAVPSGPAPVDLALV 






DPVPNDI/T PVDP VLVKSRPTDPRRGAVSSALGGSAPQIjLVES ES 

ij>pp:<tiipetvtcevvdslkiesgtsatthearprplsi>seyrrr 
rqqrqaetetrspqpptgkwpslpbtptgiiadi pclvi ppapak 
ktalqrs petple i clvp vgps pas ps pe p pvs kpvas s pteqv 

PSQEMPLLARPSPPVQSVSPAVPTPPSMSAAIJ>FPAGGLGMPPS 
LPPPPLQPPSLPLSMGPVLPDPFTHYAPLPSWPCYPHVSPSGYP 
CLPP ? PTVPLVSGTPGAYAVPPTCSVPWAPPPAPVSP ysstcty 
GPIiGWGPGPQHAPFWSTVPPPPLPPASIGRAVPQPKMESRGTPA 
GPPENVLPLSMAPPLSLGLPGHGAPQTEPTKVEVKPVPASPHPK 
>^SALVQSPQMKALACVSAEGVTVEEPASERLKPETQETRPRB 
KPPLPATKAVPTPRQSTVPKL,PAVHPARLRKLSFI>PTPRTQGSE 
DWQAFISE IG I EASDLSSLtEQ FEKSEAKKECP PPAPADSLAV 
GNSGGVD I PQE KRPLDRLQAPELANVAGXT PPATPPHQliWKPLA 
AVSLLiAKAKS PKSTAQEGTLKPEGVT3AKHPAAVRLQEGVHGPS 
RVHVGSGDHDYC\ VRSRTP PKK\ MP ALLI PEVGSRWNVKRHQD I 
TIKPVLSLGPAAPPPPCIAASREPLDHRTSSEQADPSAPCLAPS 
SLIiS PEASPCRNDMNTRTPPEPS AKQRSMRCYRKACRSAS PSSQ 
GWQGRRGRNSRSVSSGSNRTSEASSSSSSSSSSSRSRSRSLSPP 
HKRWRRSSCSSSGRSRRCSSSSSSSSSSSSSSSSSSSSRSRSRS 
PS PRRI^DPJIRRYS S YRSHDHYQRQRVIjQKERA I EERRWFIGK 

ipgrmtrselkqrfsvfgeieectihprvqgdnygfvtyryaee 
afaaiesghki*rqadeqpfdi»cfggrrqfckrsysdi*dsnredf 
dpapvkskfdsldfdtllkqaqknlrr 


5991 


334 


1379 


RLSSHFSQCS PS I YC \TKFDKQGNVTS FERKKTELYQEIX3LQAR 
D LRFQHVMS I TVRNNR I IMRME YLKAVITP ECLLILDYRNUSTLK 
QVTLFR3LPSQLSGEGQLVTYPLPFEFRAIEALLQYWINT3X?GKL 
S I LQPLI LETLDAIjGDP KHS S VDRS KI*HI LLQNG KSLS E L ETD 1 
KI FKESILBILDEEELLEELCVSKWSDPQVFEKSSAGIDHAEEM
ELLLENYYRJOADDLSNAAREDRVLIDDSQS i ifinldshrnvmm 
RLNLQIjTMGTFS LSLFGLMGVAFGMNIjESS LE EDHRX FWL I TG I 
MFMGSGIiIWRRLLSFLGR/LARSS IASYGMKDMVHGGIVEGIj 


5992 


2 


609 


AGPDFRLVCGVSGSGFPGGRQGQATEWRPLRPWNGAMEKIjRRVI* 
SGQDDEEQGLTAQDSQINIj/SEVLDASSLSFNTRLKMFAICFVC 
GVFFS I LGTGLLWLPGG I KLFAVF YTLGNLAALASTC FLMG PVK 
QLKKMFEATPXIATIVMLLCTIFTLC^^ 
FLSMTWYSLSYI PYARDAVI KCCSSLLS 


5993 


1650 


594 


AEGLGSWAVWAGIX^WAGRHMEAGGATGAliGVGCKLPSAFCFPGS 
SVAMDMFQKVE KIG EGTYGVVYKAKNRETGQLVALKK1 RLDLEM 
EGVPSTAIRE I SLLKELKHPNIVRiLDVVHNERKLYIiVFEFLSQ 
DIJCKYt4DSTPGSEI>PI^IKSYI^QLLQGVSFCHSHRVIHRDLK 
PQNLLINELGAI KLADFGLARAFGVPLRTYTHE WTLWYRAPEI 
LIATRFYTTAVDIWSIGCIPAEMVTRKALFPGDS\EIDQ\LFRI 
FRMLGTPSEDTWPGVTQLPDYKGSFPKWTRKGLEEIVPNLEPEG 
RCLLMQLLQYDPSQRITAKTALAHPYFSSPEPSPAARQYVIjQRF 
RH 


5994 


394 



1934 



AGEVQLHVWI RGMRIQPQ/XAAAI IDIjDPDFEPQSRPRSCTWPL 
PRPEIANQPSKPPEVEPDLGEKVHTEGRSEPILLPSRXPEPAGG 
PQPGI1X3AVTGPRKGGSRRNAWGNQSYAELISQAIESAPEKRLT 
IAQIYEWM\TOTVPYFKDKGDSNSSAGWKNSIRHNl*SliHSKFIICV 
HNEATGKSSWWMI^PEGGKSGKAPRRRAASMDSSSKIiLRGRSKA 
PKKKPSGLPAP PEGATPTS PVGHTAKWSGSPCSRNREEADMWTT 
FRPRSSSNASSVSTRIjSPLRPESEVLAEEIPASVSSYAGGVPPT 

lneglelldglnltsshsllsrsglsgfslqhpgvtgplhtyss 
slfspaegplsagegcfsssqalealltsdtppppadviwtqvd 
pilsqaptiili/mlpssskijmsvglcpkpleapgpsslvptl 
smiapppvmasapi pkalgtpvltppteaasqdrmpqdldldmy 
menlecdmdni isdlmdegegldfnfepdp 


5995 


2 


2437 


RPPGPGPASGAWLCTRARGSAAFVPPLPRPPSRGARRRRRiPGR 
GVAAIiRRGPGSAPGLPRGRAERSAAGSGRGPSREERGAAAAAAA 
AEMMEEIiHSL\DP\RRQELLEARF\TGliGVSKGPIiNSESSNQSL 
CSVGS1*S DKEVETPEKKQNDQRNRKRKAEPYETS QGKGTPRGHK 








ISDYFERRVEQPLYGIJX5SAAKEATEEQSAIj?TIi^VMLAKPRLi 
DTEQLAQRGAGLCFTFVSAQQNSPSSTGSGNTEHSCSSQKQI S I 

QHRQT\QSDLTI eki SALENS knsdlekkegriddllrancdlr 
RQ I \deqq kmlekyk\ erlnrcfdne prnfli bks kqekmacrd 

KSKQDRIiRIXaHir 1 rVRHGAS r ri>QW i Ct? Y Ar QNLI KQQcRXNSQ 
REE IERQRKMLAKRKP PAMGQAP P ATNEQ KQRKS KTNGAENBTL 
TLAEYHEQEE I FKLRliGHLKKEEAEIQAELERLERVRNLHIREL 
KRIHNEDNSQFKDHPTI^DRYLLLJrLLGRGGFSEVYKAFDLTEQ 
RYVAVKIHQ^KNVTRDEKKENYHKHACREYRIHKELDHPRIVKL 
YD Y PS LDTDS FCTVLE YCEGNDLDFY LKQH KLMSEKEAKSXlPtQ 
IVNALKYLNEI KPP 1 1 H YDLKPGNI LLVNGTACGE I KI TDFGLS 
KIMDDDSYNSVDGMELTSQGAGTYWYLPPECFWGKEPPKISNK 
VDVWSVGVIFYQCLYGRKPFGHNQSQQDILQENTILKATEVQFP 
PKPVVTPEAKAFIRRCLAYRKBDRIDVQQLACDPYLLPH I RKS V 
STSS PAGAAIAS TSGASNNSSSN 


5996 


1612 


981 



DQQACLLGLMLTLEFGILEFDPSWIGSWTQR/SWVSWRSRPGCE 
LFS I WFGS I VNEG YLNSASEGEEFCI YNRNPNACS YGVAVGVL 
AFLTCLLYLALDVYFPQISSVKDRKK\AVI»SGHPVVSGEPHPAA 
FWAFLWFTGDS CYL\ANQWQVS KPKDNPLNEGTDAS PGRPSPFS 
FFS I FTWS LTAALAVRRFKDLS FQEEYSTLFP \ ASAQP 


5997 


1612 


981 


DQQACLLGLMLTLEFG I LEFDPS WIGSWTQR/ SWVSWRSRPGCE 
LFS I WFGS IVNEGYLNSASEGEEFC I YNRNPNACS YGVAVGVL 
AFLTCLLYLALDVY FPQ I S SVKDRKK\ AVLSGHP WSGEPHPAA 
FWAFLWFTGDS CYL\ANQWQVS KPKDNPLNEGTDAS PGRPS PFS 
FFSI FTWSLTAALAVRRFKDLS FQEEYSTLFP \ASAQP 


5998 


1612 


981 


DQQACLLGLMLTLEFGILEFDPSWIGSWrQR/SWVSWRSRPGCB 
LFS I WFGS I VNEGYLNS ASEGEE FC I YNRNPNACS YGVAVGVL 
AFLTCLLYLALDVYFPQ I S S VKDRKK\AVLSGHP WSGE PHPAA 
FWAFLWFTGDSCYL\ANQWQVS KPKDNPLNEGTDAS PGRPSPFS 
FFS I FTWSLTAALAVRRFKDLS FQEEYSTLPP \ASAQP 


5999 


2 


1790 


RPPMEKARRGGDGVPRGPVLHIWVGFHHKKGCQVEFSYPPLIP 
GDGHDSHTLPEEWKYLPFLALPDGAHNYQEDTVFFHLPPRNGNG 
ATVFG I SCYR \ Q IEAKALKVRQADITRETVQKS VCVLSKL PL YG 
LLQAKLQL 1THA Y FEEKDFS QIS I LKELYEHMNS SLGGAS LEGS 
QVYLGLSPRDLVMFRHKGLILFKLILLEKKVLFYISPVNKLVG 
ALMTVLSLFPGM IEHGLSDCSQYRPRKSMSEDGGLQESNPCADD 
FVSAS TADVSHTNLGT I RKVMAGNHGEDAAMKTEE PLFQVEDSS 
KGQEPNDTNQYLKPPSRPSPDSSESDWETLDPSVLEDPNLKERE 
QLGSDQTNLFPKDSVPSESLPITVQPQANTGQWLI PGLISGLE 
EDQYGMPLAI FTKGYLCLPYMALOQHHLLSDVTVRGFVAGATNI 
LFRQQ KHLSDAI VE VEEALI Q I HDPEIjRKLLNPTTADLRFAD YL 
VRHVTENRUDVFLDGTG WEGGDEW IRAQFAVY I HALLAATLQL V 
LFRI VNVAKKI GNVMVTT \ S RNWQTGK\AVG QS VGG AFS \ S AK 
TA\MSS WLSTFTTS TSQSLTEPPDEKP 


6000 


101 



1561 


TEPCRTAENCTATMSENNKNSLESSLRQLKCHFTWNLMEGENSL 

KABELIO^EHADQAEIRSLVTWGNYAWVYYHMGRLSDVQIYVDK 
VKHVCEKFSS P YR I SSPELDCEEGWTRLKCGGNQNERAKVCFEK 
ALEKKPKNPEFTSGLAIASYRLDNWPPSQNAIDPLRQAIRLNPD 
NOYIJCVIJjAIJajHKMREEGEEEGEGEKXLVEEALEKAPGXVTDV 
LRSAA\KFYRGKDEPDKAI ELLKKALE Y I P \NNAYLHCQIGCCY 
RAKVFQVMNLRBNGt^YGKRKLLELlGHAVAHLKKADEANDNLFR 
VCS I LAS LHALADQ YEDAE YY FQ KEFS KELT PVAKQLLHLR YGN 
FQLYQMKCEDKAZHHFIEGVKINQKSREKEKMKDKLQKIAKMRL 
SKNGADSEALHVLAFLQELNEKMQOADEDSERGLESGSLIPSAS 

SWNGB 


6001 


176 


1038 


AFAHSPSRGHRKTHIHTPRHTPRCTMAESHLQSSLITASQFFE I 
WLHFDADGSGYLEGKEXiQNL I QELQQARKKAGLELS PEMKTFVD 
QYGQRDDGK1GI VELAHVLPTEENFLLLF11CQQLKS CE\ EFMKT 
WRKYDTDHSGFI ETEELKNFLKDLLEKANKTVDDTKLAEYTDLM 








LKLFDSNNDG KLELTEMARLLP VQENFLLKFQG I KMCGKEFNKA 
FEL YDQDGNG Y I DENE LDALLKDLCE KNKQ DLD I NN I TTYKKNI 
MALSDGGKLYRTDLALILCAGDN 


6002 


9*77 


81 


LAP PGGGLHI PPRT PLS HSRP P PSHHAPHPS P LP LP P ADLH PHS 
SMAQRSDIJjELDCQLTRDRVVWSHDENIX3iQSGLimDVGSlJDF 
EDLPLYKEKiEVYFSPGHFAHGSDRRMVRLEDLFQRFPRTPMSV 
EI KGKNEELIREQ/ VLVRRYDRNEITIWASEKSSVMKXCKAANP 
EMPLSFTISRGFWVLLS Y YLGLLPFI P 1 PEKFFFCFLPNT TNRT 
YFPFSCSCLNQLLAWSKWLIMRKSLIRKIjEERGVQVVF^CLNE 1 
ES DFEAAFSVQATG VI TD YPTALRHYLDNHGPAARTS 


6003 


140 


4098 


GKLRAFRGMRRLICKRI CDYKSFDDEESVDGNRPSSAASAFKVP 
AP KTSGNP ANSARK PG SAGG P KVGAGAS KEGG AGAVDE DDF I KA 
KTD VPS IQ IYS SRELEETLNKI RE ILSDDKHD WDQRANALKK1R 
S LL VAGAAQ YDC FF QHLRLLDGAL KLSAKDLRS Q WREACI TVA 
KI^TVIiGNKFDHGAEAIVPTLFNLVPNSAKVMATSGCAAIRFI I 
RHTHVPRLI PLITSNCTS KSVPVRRRSFEFLDLLLQEWQTHSLE 
RHAAVLVETIKKGIHDAI>AEARVEARKTYMGI,RNHFPGEAETLY 
KSLEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQESLNRPFSS 
KWSTANPSTVAGRVSAGSS KASSLPGSLQRSRSDIDVNAAAGAK 
AHHAAGQS VR S GRLG AGALNAG S YAS LEDTSD KLDGTAS EDGRV 
RAKLS APLAGMGNAKADSRGRSRTKMVSQSQPGSRSGS PGRVLT 
TTALSTVSSGVQRVLVNSASAQKRSKXPRSQGCSREASPSRLSV 
ARSSRI PRPSVSQGCSREASRESSRDTSPVRSFQPLASRHHSRS 
TGALYAPEVYGASGPG YG I SQSSRLS S SVSAMR VTjNTGSDVEEA 
VADALLLGD IRTKXKP ARRR YES YGMHSDD DANS DASSACSERS 
YSSRNGSIPTYMRQT\BDV\AEVLNRCASSNWS ERKEGLLGLQN 
LLKNQRTLSRVEI*KRLCEIFTRMFADPHGKRVFSMFLETLVDFI 
QVHKDDLQDWLFVLLTQLLKKMGADI^^ 

FPNDLQFNILMRFTVDQTQTPSLKVKVAILKYI ETLAKQMDPGD 
FINSSETRLAVSRVITWTTEPKSSDVRKAAQSVLISLFELNTPE 
FTMLLGALPKTFQDGATKLLHNHLRNTGNGTQSSMGSPLTRPTP 
RSPANWS S PLTS PTNTS QNTLS PSAFDYDTENMNS EDI YSSLRG 
VTEAIQNFSFRSQEDMNEPLKRDSKKDDGDSMCGGPG\MSDPRA 
GGDATDS SQTAL\ DNKASLLHS MPTHSS PRSRD YNPYNYS D S I S 
PFmSALKEAMFDDDADQFPDDLSLDHSDLVAELLKRr >SNHNER 
VEERKIALYELMKLTQEES FS VWDEHFKTILLLLLETLGDKEPT 
IRALALKVLRJ2 IIjRHQPARFIOfl YAELTVMK^ 

SAEEAAS V\ LATS I \ S PEQC I KVLCP I IQTAD YP I NLAAI KMQT 
KVIERVSK2TLNLLLPEIMPGLIQGYDNSESSVRKACVFCLVAV 

HAVIGDELKPHLSQLTGSKMKLLNL Y I KRAQTGSGGADPTTD VS 
GQS 


6004 


140 


4098 


GKLRAFRGMRRL I CKR I CD YKS FDDEES VDGNR P S S AAS A FKVP 
APKTS GN PANSARKPGS AGGP KVGAGAS KEGGAGAVDEDDF I KA 
FTDVPSIQIYSSRELEETLNKIREILSDDKHDWDQRANALKKIR 
S LL VAGAAQ YDCFFQHLRT JiTXSALKLS AKDLRSQVVREACI TVA 
RLSTVLGNKFDHGAEAIVPTLFm*\/PNSAKVMATSGCAAIRFI I 
RHTHVPRLI PLITSNCTS KSVPVRRRS FEFLDLLLQEWQTHSLE 
RHAAVLVETI KKG I HDADAEARVEAR KT YMGLRNH FPGEAETLY 
NS LiEPS YQKSLQT YLKS SGS VAS LPQSDRS S SSSQESLNRP FS S 
KWSTANPSTVAGRVSAGSSKASSLPGSLQRSRSDI DVNAAAGAK 
AHHAAGQSVRSGRLGAGALNAGS YASLEDTSDKLDGTAS EDGRV 
RAFCLSAPLAGMGNAKADSRGRSRTKMVSQSQPGSRSGS PGRVLT 
TTAI*STVS SGVQRVLVNS ASAQKRS KI PRSQGCSREAS PSRLSV 
ARSS RIPRPS VSQG CS REASRES SRDTS PVRS FQPLASRHHSRS 
TGALYAPEVYOASGPGYGISQSSRLSSSVSAMRVLNTGSDVEBA 
VADALLIX5DIRTKKKPARRRYESYGMHSDDDANSDASSACSERS 
YSSRNGSIPTYMRC/T\EDV\AEVIjNRCASSN1^SERKBGLLGLQN 
LLKNQRTLSRVELKRLCEIFTRMFADPHGKRVFSMFLETLVDFI 
QVHKDDLQDWLFVLLTQLLKJCMGADLLGSVQAKVQKAIJDVTR^ 
FPNDI^FNILMRFTVDQTQTPSLKVKVAILXYIETI^ 
F INS S ETRLAVS RVI TWTTEP KS S D VRKAAQS VLISLFELNTPE 






WO 01/53312 



PCT/US00/34263











FTMIJjGALP KTPODGATKLLHNHIiRNTGNGTOS S MGS PLTRPT P 
RS PANWSS PLTSPTNTSQNTI»S PSAPD YDTENMNSED I YSSLRG 
VTEAIQNPS FRSQEDMNEPLKRDSKKDDGDSMCGGPG \MSDPRA 
GGDATDS SQTAL»\DNKASLIxHSMPTHSS PRSRDYNP YNYSDS I S 
PFNKSALKEAMFDDDAIX^FPDDLSLDHSDLVAELIiKEUSNHNER 
VEERKIAL YELMKLTQEESFS VWDEHFKT I LL.TJ J .KTLGDKE PT 
IRALALK>OjREIIJIHQPARFKNYAELTVMKTLEAHKDPHKEV^ 
SAEEAASV\LATSI \SPEQCI KVLCPI I QTADYPINIAAX KKQT 

KVI ERVSKETLNLLLPE impgliqg ydnsess vrkacvfclvav 

HAVIGDE LKPHLSQLTG S KMKLLNL. YI KRAQTGS GGAD PTTDVS 
GQS 


6005 


133 


5955 


RSSGRRQEQLGQFPGRERKGMASGLGSPSPCSAGSEEEDMDAIJj 
NNSLPPPHPENEEDPEEDLSETETPKJLKKKKKPKKPRDPKIPKS 
KRQKKERMLI*CRQLGDSSGEGPEFVEEEEEVAI*RSDSEGSDYTP 

gkkxkkklgpkkekks xs krkeeeeeddddddds kepkss aqli, 
bdwgmedidhvfseedyrtt.tnykafsqfvrpl.iaaknpkiavs 
kmmmvlgakwrefs tnnp fkg s s gas vaaaaaaavavves mvt a 
tevapppppvevpirkaxtkegkgpnarrkpkgsprvpdakkpk 
pkkvaplki klggfgs krkrsssedddiidvesdfddas ins ysv 
sdg sts rs s rs rkkiirttkkkkkgeeevtavdg yetdhqdycev 
cqoggeiilcdtcprayhmvcldpdmekapegkwscphcekegi 
qweakednsegeeileevggdleeeddhhmefcrvckdggellc 
cdtcpsstoihclnpplpeipngewlcprctcpaljcgkvqiali 

WKWGQP PS PT PVPRP P DADPNTPS PKPLEGRPERQFPVKWQGMS 
YWHCSWVSELQLELHC\QVMFRNYQRKNDMDEPPSGDFGGDEEK 
S\RKRKNKDPKFAEMEERPYRYGIKPEW\MMIHRILNHSVDKKG 
HVHYLI KWRDLPYDQAS WESEDVE I QDYDI»FKQS YWNHRE LMRG 
EEGRPGBCKLKKVKIiRKI^RPPETPTVDPTVKYERQPEYIjDATGG 
TlaHPYQMEGLNWIjRFSWAQGTDTII^ADEMGLGKTVQTAVFLYSL. 
YKEGHSKGPFXVSAPIiSTIIN\WEREFEMWAPDMYV\VTYVGDK 
DSRAI I RENE FS \FEDNAI RGGKKASRMKKEAS V KFHVLLTS YE 
LIT I DMAILGS IDWACI>I VDEAHRL KNNQS KFFRVLNG YSLQHK 
LLLTGTPI^NNLEELFHIiLNFIjTPERFHNXEGFIfEEFADIAKED 
QIKKLHDMLG\ PHMLRRLKADVFKNMPSKTELI V\RVEI*SPM\Q 
KKYYKXYILHSKFLKALNXARGGGNQVSLJLNVVMDIjK^ 

I^PVAAMEAPKMPNGMYIXSSALIRASGKLIjIM 

RVLI FSQ^T^K^a^I*LEI)FLEHEGYKYERIDGGITGNMRQEAIDR 

FNAPGAQQFCFIjLSTRAGGLiG INLATADTVI I YDS DWNPHND I Q 

AFSRAHR I GQNKKVM I YRFVTRASVEER I TQVAKXKMMLTHLW 

RPGLGSKTGSMSKQELDDILKFGTEELFKDEATDGGGDNKEGED 

S SV I HYDDKAI ERLLDRNQDETEDTEIjQGMNB YL SS FKVAQ YW 

REEEMGEEEEVEREIIKQEESVDPDYWEKI^IiRHHYEQQQBDLAR 

NIXjKGKRIRKQVNYNDGSQEDRDWQDDQSDNQSDYSVASEEGDE 

DFDERSEAPRRPSRKGXRNDKDKPLPPIJjARVGGNIEVLGFNAR 

QRKAFLNAJ^RYGMPPQDAF-ITQVTLVRDIjRGKSEKEFKAYVSLF 

MRWLCFPG ADGAET FADGVPREGLiSROH VLTR I GVMS LIRKKVO 

EFEHVNGRWSMPEIAEVEENKKMSQPGSPSPKTPTPSTPGDTQP 

ntpapvp paedgi KIEENSLKEEES IEGEKEVKSTAPETAI ect 

QAPAPASEDEKVVVEPPEGEEKVEKAEVKERTEEPMETEPKGKG 

AADVEKVEE KSAI DIiTPI WEDKEE KKEEEEKKE VMIiQNGETP K 

DLNDEKQKXNIKQRFMFNIADGGFTEIiHSLWQNEERAATVTKXT 

YEIWHRRHDYWIJAGIINHGYARWQDIONDPRYAILN3PFKGEM 

NRGNFLEI KNKFLARR FKLLEQALV I EEQLRRAAYLNMSED PSH 

PSMALNTR FAEVECIJ^ESHQHLS KESMAGNKPANAVIJIKVI*KQI» 

EELLSDMKADVTRLPATIARI PPVAVPXQMSERNILSRLANRAP 

EPTPQQVAQQQ 


6006 


1 


965 


DNDFXRirrVHRHEPPVTAEPIRLLAENEDVVVVDXPSSIP^ 
GRFRHNTVT F IliGKEHQLKEI^LHPXDRLTSGVIiMFAKTAAVS 
ERIHEQVRDRQLEKE YVCRVEGEFPTEEVTCKEPI LWSYKVGV 
CRVDPRGKPCETVFQRLS YNGQSS WRCR PI/TGRTHQ IR VHLQF 

LGHPILNDP I ynsvawgpsrgrggyi:pktneellrdi*vaehqak 






WO 01/53312 



PCT/US00/34263 



SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop Codon,
/=possible nucleotide deletion, \=possible nucleotide insertion)








QSLDVLDLCEGDIiS PGLTDSTAPSS ELGKDDI*EELAAAA\QKME 
EVAEAAPQELDTI ALAS E KA VETDVMNQ \ RQT \ TLCR VPAGATG 
SLAPRPCDVPTCPTL \ 


6007 


3 


2351 


HELGQVEYVFTDKTGTLTENEMQFRECS INGMKYQEINGRLVPE 
GPTPDSSEGNLSYLSSLSHLNNLSHLTTSSSFRTSPENETELIK 
EHDLFPKAVSIX^TVQIlWVQTDCTGrXSPVJQSNLAPSQLEYYAS 
SPDEKALVEAAARIGIVFIGNSEETMEVKTLGKLERYKLLHILE 
FDSDRRRMSVTVQAPSGEKLLPAKGAESSILPKCIGGEIEKTRI 
HVDEFALKGLRTLCIA YRKFTS KEYEE IDKRI FEARTALQQR\E 
EKLAAVFQF I EKDLI LLGATAVEDRLQDKVRET I EALRMAG I KV 
WVLTGD KHETAVS VSLS CGHFHRTMN ILEL INQKS DSE CAEQLR 
QLAPJilTEDHVIQHGLVVDGTSIaSIjAI^EHEKLFMEV 
LCCRMAPLQKAKVIRL I KI S PEKP I TLAVGDGANDVSM 1 QEAHV 
GIGIMG KEGRQ AARN SD Y AI AR FKFLS KLIiFVHGHFY Y I R I ATL 
VQY FFYKNVCFI TPQFL YQ FYCLFSQQTLYDS VYLTLY \N I C FT 
SLPILIYSLLEQHVDPHVLQNKPTLYRDISKNRLLSIKTFLYWT 
ILGFSHAFIFFFGSYliLIGKDTSLl^NGCMFGNWTFCTLWTVM 
VITVTVKMALETHFWTWINHLVTWGS I XFYFVFSLFYGGILWPP 
LGSQNMYFVFIQLLSSGSAWFAI ILMWTCLFLDI IKKVFDRKL 
HPTSTEKAQLTETNAG I KCLDSMCCTPEGEAACAS VGRMLERV I 
GRCSPTHISRSV7SASDPFYTNDRS I LTLSTMDSSTC 


6008 


4554 


1089 


AGVRRAGARRG PGRALPAGATAVP P PSARRRRRCPAPEHAG PAR 
ASRPSOETMFQLPVNNLGSLRKARKTVXKIIiSDIGLEYCKEHIE 
DFKQFEPNDFYLKNTTWEDVGLWDPSLTKNQDYRTKPFCCSACP 
FSSKFFSAYKSHFRNVHSBDFENRI LLNCPYCTFNADKKTLETH 
I KIFHAPNASAPSSSIiSTFKDKNKNDGLKPKQADS VEQAVY Y CK 
KCTYRDPLYE IVRKHI YREHFQHVAAP Y IAKAGE KS LNGAVP LG 
SNAREESSIHCKRCLFMPKSYEALVQHVIEDHERIGYQVTAMIG 
HTNVWPRSKPLML I APKPQDKKSMGLP PRIGS LASGNV\RS LP 
SQQMVNRLSI PKPNLNSTGVNMMSS VHLQQNNYGVKSVGQGYSV 
GQSMRLGLGGNAPVS I PQQSQS VKQLLPSGNGRS YGLGSEQRSQ 
APARYSLQSANASSLSSGQLKSPSLSQSQASRVLGQSSS KPAAA 
ATGPPPGirrSSTQKWKICrriCITEliFPENVYSVHF^KEllKAEKVP 
AVANYI MKIHNFTS KCXiYCNR YLPTDTLLNHMLI HG LS CP YCRS 
TFNDVE KMAAHMRMVH I DEEMG PKTDSTLS FDLTLQQGS HTN I H 
LLVTT YNLRDAPAE S VAYHAQNNPP VP P KPQPKVQE KAD I P VKS 
SPQAAVPYKKDVGKTLCPLCFS ILKGP ISDALAHHLRERHQVIQ 
TVH P VEKKLTYKCI HCLGVYTSNMTAST ITLHLVHCRG VGKTQN 
GQDKTNAPSRLNQSPSLAPVKRTYEQMEFPLIiKKRKLDDDSDSP 
S FFEEKPEEP WLALDPKGH \ EDDS YEARKS FLTKYFT \ KQP YP 
TRREIEKIiAASLWV\WK\SDIASHFSNKRXKCVRI)CEKYKP^ 
LGFN>IKEiaTKVKHEMDFDAEGLFENHDEKDSRVNAS KTADKKLN 
LGKEDDSSSDSFENLEEESNESGSPFDPVFEVEPKISNDNPEEH 
VLKVIPEnASESEEKLDQKEDGSKYETIHLTEEPTKLMHNASDS 
EVDQDDWEWKDGAS P SESGPGS QQVSDFEDNTCEMKPGTWSDE 
S SQSEDARS S KPAAKICKATMQGDREQIiKWKNSS YGKVEG FWS KD 
QSQWKNASENDERLSNPQIEWQNSTIDSEJX3EQFDNMTDGVAEP 
MHGSLAGVKLSSQQA 


6009 


4272


1534 


CHGLQHLTPFRELNLSLQG*EPH*AA*QAVRSEEKSIC*GSPSC 
HLVLG VLVP VARQSSHS AG PAQSAFR * TGTG SGTPKAAEQS GYW 

paVn^/iWOMlJMMPPTOT?PPT.\7MK<*3R'RTMCGKCEKG*VSDSVTGG 

RAVAGEQASQRRTVFTAGGGECLGAKS VRAS VFTGNQPGVT4G LL 
NGKRGGCFESG YLFGF 1 VTGKI QSLEAKVPLP VNGQTG ERAS PG 
NCR IHI VDAVC* SEHH* DHFLAAAFLENSTI IS * VAPGSWQDHA 
VLQ KEVQAS VRCRGFES VDTAPAGFWAHS PPGLQGEPTTTSVSL 
EVIJU>QIX3EGVPFVEGQLVTVLGI*VVPQSI RHTFVHHTQLFLHP 
I * KLGALDVAFLHLLTLVCS S FNVAYG *GKNGGTTLHQL FAEVN 
AVTRGSAVQRRPS ITI SSIHVDTKIQQEIiHDVMVAGADGWQWG 
DPFWGLAGI FHLI DDPLHQIELSFQRRV* EQCQGVKPDSQPVP 
RPLRVGLLQVGPLVRGGGRRVAGRGKRCWRDLLFPWRWGLSHRT 
RDLLRGGDRGHVWXVLCRLGSLVGGLGTDELLWFGGR * LI I IG 








I * * RGRLSGE WGCGLGRGELFQVS IGIGVS I VHIGQGDHEVUGG 
AGIiVERGAIJiATG^VEALVOQI^DVGPAGAlX^IXIXSAALFQG P 
GRVG QL P AEG IiQVC I TL VAQ WRMHDG RELGGAE W PWQALHGAAI 
CGVGGAIIJ J KAI^QYFLKGG*RI J WCARGO*PVKKRQRRWRG*TR 
R*NGLTIHCFN* LI * GAVCCRLVI LRWCGLLEVHGVYGT* IHCL 
GS FPGRLWP* PFI SQERPNGHCQWE FRLAVPSWKCRWSRWRVRG 
TWRYGNPIJaNI^+GAWI/3GAACGGQ^X^PI^ 
LP P FQGACR PRTQRCRTWVCPIAVJRQLLAYTRD 


6010 


1 


3533 



IMPCGSSRIiLRGC^rraPNEPVSDLSYFDCIESVMENSKVLGESM 
AGI SQNAKTGDLPAFGECVGI AS KALCGLTEAAAQAAYLVGIFD 
PNS QAGHQGLVD P IQFARANQA I QMACQNLVDPGSS PS QVLS AA 
TI VAKHTSALCNACRIASS KTANPVAKRHFVQSAKEVANSTANL 
VKT I KALDGDFS EDNRNKCRI ATAPL I EAVENLTAFASNPEFVS 
IPAQI SSEGSQAQEPI LVSAKPMLES SSYliIRTARSIAINPKDP 
PTWSVLAGHSHTVSDSIKSLITS I RDKAPGQRE CDYS IDG INRC 
IRDIEQASLAAVSQSLATRDDISVEALQEQLTSVVQEIGHLIDP 
IATAARGEAAQLGHKGTQIiASYFEPLI LAAVGVASKILDHQQQM 
T\njDQTKTLAESAljO^^YAAKEGGGNPKAQHTHDAITEAAQLMK 
BAVDDIMVTI »NEAA5EVGI*VGGMVDAIAEAMS K1*DEGTP PE PKG 
TPVDYG/TT\A7KYSKAIAVTAQEMMTKSVTNPBEIjGGLASQMTSD 
YGHLAFOGQMAAATAEPEEIG FQIRTRVQD1X5HGCIFLVQKAG\ 
ALQVC PTDS YTKR E LI E CARAVTE KVS L VLS ALQAGNKGTQ ACI 
TAATAVSGI I ADLDTTI MFATAGTLNAENS ETFADHRENILKTA 
KALVEDTKLL.VSGAAS T PDKLAQAAQS S AAT I TQLAE VVKLGAA 
SLGSDDPETQWLINAI KDVAKALSD L I S ATKGAAS KPVDDPSM 
YQLKGAAFCVMVTI^TSLLKTVTCAVEDEATKGTRALEATI EC IKQ 
ELTVFQS KDVPEKTSS PEES I RMT KG I TMATAKA VAASNS CRQE 
DVIATANLSRKAVSDMLTACKQASFHPDVSDEVRTRALRFGTEC 
TIXTyiJ>LLEHVLVI IiQKPTPEIJCO^LAAFSKRVAGAVTEI* I QAA 
EAMKGTEWVDPEDPTVIAETELLGAAAS I EAAAKKLEQLKPRAK 
PKQADETLDFEEQ I LEAAKS I AAATSALVKSAS AAQRELVAQGK 
VGS I PANAADDGQWS QGLI SAARWAAATSSLCEAANASVQGHA 
S EEKLI S SAKQVAAS TAQLLVACKVKADQDSEAMRRLQAAGNAV 
KRASDNLVRAAQKAAFGKADDDDVWKTKFVGG I AQI IAAQEEM 
LKKERELEEARKKLAQIRQQQYKFLPTELREDEG 


6011 


446 


1835 


LLQPAP^KSPGI^DCLWAWILLiL»oT1j I v»Kb xQiyJk'^ljyUJWbJvL'iN X 
TVFTRILDRLLDG YDNRLRPGLGERVTEVKTDI FVTSFGPVSDH 
DMEYTIDVFFRQSWKDERLKFKGPMTVLRLNNLMASKIWTPDTF 
FHNGKKS VAHNMTMPNKLLRI TEI3GTI*LYTMRLTVR\AECPMAF 
GRDFPM\D\AHACPLKFGSYAYTRAEVVYEWTREPARSVVVAED 
GSRLNQYDLLGQTVDSGIVQS STGEYVVMTTHFHLKRKIGYFVI 
QTYLPCIMTVI LSQVS FWLNRES VPARTVFGVTTVLTMTTLS I S 
ARNSL PKVAYATAMDW FIAVCYAFVFSAL IEFATVKYFTKRGYA 
WIX5KSWPEKPKKVKDPLIKKIJNTYAPTATSYTPNLARGDPGLA 
TIAKSATIEPKEVKPETKPPEPKJCTFNSVSKIDKLSRIAFPLLF 
GI FNLVYWATYLNRE PQLKAPTPHQ 


6012 


351 


5013 


PAELFQS FAI WHKBLYDWRLG P WNQCQPVI S KSIiEKPLE CI KGB 
EGIQVREIACIQKDKD I PAEDI ICEYFEPKPLLEQACLI PCQQD 
CIVS EFSAWSECSKTCGSGLQHRTRHWAPPQFGGSGCPNLTBF 
QVCQSSPCEAEELRYS LHVGPWS TCSMPHS RQVRQARRRGKNKE 
PFTTOR «5KfiVKDPEAREL I KKKRNRNRONROENKYWD IQ IG YQTR 
EVMCINKTGKAADLS F CQQEKLPMTFQS CV I TKECQVSE WS EWS 
P CS KTCHDMVS PAGTRVRTRTI RQ FP IGS E KECPE FEEKE PCLS 
. O^DG VVPCATYGWRTTEWTECRVDPIiLSQQDKRRGNO/TALCGGG 
IQTREVYCVQANENLLSQI^THKNKEASKPMDLKi PNTT 
QLCHI PCPTECEVS PWS AWGPCT YENCNDQQGKKGFKLRXRRI T 
NEPTGGSGVTG^C^HLLEAIPCEEPACYDWKAVRLGDCEPDNGK 
ECG PGTQVQEWC INS DGEEVDRQLCRDAI F PI PVACDAP CPKD 
CVLSTWSTWSSCSHTCSGKTTEGKQIRARSILAYAGEEGGIRCP 
NS^I^EVRSCNEHPCTVYHWQTGPWGQCIEDTSVSSFNTTTTW 
NGEAS CSVGM QTRKV I CVRVNVG Q VG PKKCP ES LRP ETVRP CLL 








PCKKDCIVTPYSDWTSCPS\SCKEGDSS3RKQSRHRVIIQLPAN 
GGRDCTDPLYEEKACEAPQACQSYRW\KTHKW\HRCQ\LVP\WS 
VQQDS P\GAQEGCGPGRQARAI TCRKQDGGQAG I HE CXiQ YAG PV 
PALTQACQI P CQDD CQ LT S WS KFS S CNGDCGAVRTR KRTIi VG KS 
KKKEKCKNSHIjYPLIETQYCPCDKYNAQPVGNWSDCILPEGKVE 
VLLGMKVQGD I KECGQG YRYQAMACYDQNGRLVETSRCNS HGY I 
EBACI I PC PS DCKLS EWSNWS RCS KS CGSGVKVRSKVTLRE KPYN 
GGRPCPKia>HVNQAQVYEVVPCHSI>CNQYLWVTEPVIS ICKVTFV 
NMRENCX3EGVCyrRJOmCMQNTAIX3PSEHVEDYI^PEEM?LGSR 
VCKItPCP EDCV I SEWG PWTQCVIiPCNQS S FRQRSADP I RQPADE 

KTRMUXTWSIXSKSVDLKYCE 

I^DWSPWSECSQTCGL?GKMIRRRTVTQPFQGDGRPCPSLMDQS 
KPCPVKPCYRWQYGQWSPCQVQEAQCGEGTRTRNISCWSDGSA 

SSWSLCQLTCVNGEDLGFGGIQVRSRPVIIQELENQHLCPEQMI* 
ETKSCYDGQCYEYKWMASAWKGSSRTVWCQRSIXJlTrV^ 
SQPDADRS CNP PCSQ PHS YCSETKTCHCEEG YTE VMS SNS TLEQ 
CTI>I PVVVLPTMEDKRGDVKTSRAVHPTQPSSNPAGRGRTWFLQ 
PFGPDGRLKTWVYGVAAGAFVT*I*I FIVSM I YliACKKP KKPQRRQ 
NNRLKPLTLAYDGDADM 


6013 


1161 


710 



GAFIAG VP VQ PVLIRYPN S LDTTS WAWRGPG VIiKVLWIjTASQP C 
S I VDVE FLP VYHP SP E E S RDPTL YANNVQR VMAQ ALG I PATECE 
FVGSIoPVIVVGRLiKVAIiEPQIj/WGTGKSASEGWAVRWIiCGRWGR 
ARPESNDQPGRVCQAATAI* 


6014 


2657


613 


EAVAGGMEKS RMNLP KG PDTLCFDKDEFMKEDFD VDHFVS DCRK 
RVQLEELRDDIjELYYKLLKTAMVELINKDYADFXVNLSTNLVGM 
DKALNQLSVPLGQLREEVL.SLRSSVSEGIRAVDERMSKQEDIRK 
KKMCVIjRIjIQVTRSVEKIEKI1iNSQSSKETSA1»EASSPI*IjTGQI 
LERI ATEFNQLQFHACQS K\GMPLLDXVRPRI AG ITAMLQQSLE 
GLLLEGtOTSDVDIIRHCl.RTYATIDKTRDAEAiVGQVIjVKPYI 
DE VI I EQFVESHPNGLQVMYNKIiLEFVPHHCR^ 

r\Kl tIM X v zrv^Xlyir xjv v*j VVIryl VvUilDuiUJlr our X» r\jv* ris-r\r ostn a 

TISMDFVRRI^RQCGSO^SVKRIiRAHPAYHSFNKKWNljPVYFQI 
RFRE I AGSLEAAI/TD VLEDAP AES P YCLXiASHRTWS SLRRCWSD 
EMFXPLLVHRLWRLHSGRFWARYSVFV\N\ELSLRPISNES PKE 
IKKPI»VTGSKEPS ITQGNTEDQGSGPSETKPWS rSRTQLVYW 
ADLDKLQEQLPELLiEI I KPKLEMIGFKNFSS I SAALEDSQSS FS 
ACVPSLSSKI I QDLSDS C FGFIJCSAIiEVPRL YRRTNKEVPTTAS 
SYVDSAIJKPLFQLQSGHKDKLKQAI IQQWLEGTIjSESTHKYYET 
VSDVLNS VKKMEESLKRLKQARKTTPANPVGPSGGMSDDDKI RI* 
QLALDVEYIX3EQ I Q KLGLQASDI KS FSALAELVAAAKDQATAEQ 
P 


6015 


13 


2237 


AEG CAERRGT E PWELS M S W E S GAG PGLGS QGMDL VWS AW YG KC 

WGKGSI^PLSARGIVVAWI^RAEWDQVTVYLFCDDHKLQRYAluN 

RITWRSRSGNELPLAVASTADLIRCKXlL^^ 

GMAL VR FVNL. I S ERKTKFAXV PLKCLAQ EVN I PDWIVDLRHELT 

HKKMPH I^^XIRRGCYFVI^WlJQKTYWCRQLEf^S LRETWELEEFR 

EGIEEEDQEEDKNl^rVDDITEQKPBPQDDGKSTESDVKADGDSK 

n<zwv\m <?Hr*K KAliS h KFTYERAREIjIiVS YEEEO FTVLEKFR YLP 

KAI KAWNNPS PRVECVTAEIJCGVTCEiTOEAVLDAFI*DDGFliVPT 

FEQIJUU^IEYEENVDI^VLVPKPFSQFWQPLIiRGLHSQNFTQ 

AlaTiERMl^BLPALGiSGIRPTYTIJlWTVELIVANTK^ 

S ACQ WEARRGWRLFNCS AS LxDWPRMVES CLGS PCWASPQLLR 1 1 

F\KAMGQGLQDE\EQEKLIiRI CSIYTQSGENSLVQEGSEAS pig 

KSPYTLDSLYWS VKPASSS FGSEAKAQQQEEQGS VNDVKBEEKE 

EKE\n^MVEEEEENDDQEEEEEDEDDEDDEEEDRMEVGPFSTG 

QESP TAE NARUaAQ XRGALQG S AWQ VS S EDVRWD TFP \ LG RM P R 

SRPRTPAELMLENYDTHVI FWTKPVL\EQRLEPSTCK\TDTLGL 

\ SCGVGS \GNCSNSSSSNFRGAFIiLEARGSLH \ Gl»\ KTGLQLF 


6016


13 


2237


ASGCAERRGTEPVVEI^MSWESGAGPGIXSSQGMDLVWSAWYGKC 








VKGKGSLPLSAHG I WAWLSRAEWDQVTVYLFCDDHKLQRYAliN 
RITVWRS RSGNELPLAVASTADLI RCKLLDVTGGLGTDELRLLY 
GMALVRFVNL I S ERKT KFAKVP LKCLAQEVNI PDWI VDLRHELT 
HKKMPf^NDCRRGCYFVIJDWIiQKTYWCRQ 

EGI EEEDQEEDKN IWDDI TEQ KPEPQDDG KSTES DVKADGDS K 

KAIKAWNNPS PRV E C VTAEL KGVTCENREAVIJDAFLDDG FLVPT 
FEQLAALOI EYEENVDI^VLVPKPFSQFWQPIJ^GLHSQNFTQ 
ALLERMLS EL P ALG I SG I RPTYI LRWTVEL1 VANTKTGRNARRF 
SAGQWEARRGWRLFl^C£ASIJ)WPRiWESCLGSPCWASPQLLRI I 
F\ KAMGQGLQDE\BQEKI*LRI CS I YTQSGENSLVQEGSEAS P IG 
KS PYTLDSLYWSVKPASSSFGSEAKACQQEEQGSVNDVKEEEKE 
E KEVLPDQ VEEEEENDDQEEEEED EDDEDDEEEDRMEVG PFSTG 
QESPTAENARJjLAQKRGALQGSAWQVSSEDVRWDTFPXLGRMPR 
SRPRTPAELMLEira>THVIFWTKPVL\EQRLEPSTCK\TDTLGL 
\SCGVGS\GNCSNSSSSNFRGAFI^EARGSLH\GL\KTGLQLF 


6017 


203



3469 


SHQE IEQN SAMAPRKRGGRGIS F I FCCFRNNDHPE I TYRLRNDS 
NFALQTMEPALPMPPVEELDVMFSELVDELDLTDKHREAMFAIiP 
AE KKWQI YCS KKKDQE ENKGATS WPE FYXDQLNS MAARKS LLAL 
EKEEEEERSKTIESLKTALRTKPMRFVTRFIDLDGLSCILNFLK 
TMDYETSESRIHTSLIGCIKAI1MITOSQGRAHVI1AHSESINVIAQ 
SLSTENIKTKVAVLEI LGAVC^VPGGHKKVLQAMLHYQKYASER 
TRFQTLINDLDXSTGRYRDEVS LKTA I MS F I N AVLSQGAG VES L 
DFRLHLRYE \ FLMLGIHPVMDKIJiKHENSTIJ>RHIJDFFEMIJlNE 
DELEFAKRFELVHIDTKSATQKFELTRKRLTHSEAYPHFMS ILH 
HCLQMP YKRSGNTVQYWLLLDRI IQQI VIQNDKGQDPDSTPLEN 
FNIKNVVRWLVNEilEVKQWKEQABKMRiCEHNEL 
DAKTQEKEEMMQTIiNKMKEKLEKBTTEHKQVKQQVA^ 
LSRRAVCAS I PGGPSPGAPGGPFPSS VPGSLLPPP PPPPLPGGM 
I^PPPPPI^PGGPPPPPGPPPIXSAIMPPPGAPMGIALKKKS I PQ 

SAYQRQQDFFWSNSKQKEADAIDDTI^SKLKVKELSVIDGRRA 
QNCNIIJ^RLKLSNDEIKRAILTMDEQEDLPKOMLEQLLKFVPE 
KSDIDLLEEHKROSLDRMAKADRFLFEMSRINHYQQRLQSLYFKK 
KFAERVAE VKP KVEAIRSGS EEVFRSGAIJCQLLEVVLAFGNYMN 
KGQRGNAYGFKI SSLNKIADTKSS I DKNI TLLHYL ITIVEN KYP 
SVLNLNEELRDIPQAAJCVNMTE1C 

KSQP PQPGDKFVS WSQ FI TVASFS FSDVEDLLAEAKDLFTKAV 
KHFGEEAGKIQPDEFFGIFDQPLQAVSEAiCQENENMRKKKEEEE 
RRARMEAQIJCEQRERERKMRKAKENSEESGEFDDLVSALRSGEV 
FDKDLSKLKRNRKR ITNQMTDSSRERP I TKLNF 


6018 


13 


2510 


TISQSGGIRRRREAVWFEWNMDFSRLHMYSPPQCVPENTGYTY 
AI^SSYSSDALDFETEHKLDPVFDSPRMSRRSLRLATTACTLGD 
GEAVGADSGTSSAVSliKNRAARTTKQRRS TNKSAFS INHVSRQ V 
TSSGVS YGGTVSLQDAVTRRPPVLDES W I REQTTVDHFWGLDDD 
GDLKGGN KAA I QGNGD VGAG AA1X5HNG FF CSNCNMLS 2R KD VLT 
AHPAAPGPVSRVYSRDRNQKCDDCKGKRHLDAHPGRAGTLWHIW 
ACAGYFLLQILRRIGAVGQAVSRTAWSALWLAWAPGKAASGVF 
WWLGIGVJYQFVTLISWIiNVFLLTRCLRHICKFLVLLIPLFLLLG 
LSLJiGOG\NFFSFLPVLWWASMHRTORVDDPODVFKPTTSRLKO 
PLQGD SEAFP WHWMSGVEQQVAS LSGQ CHHHGENLRELTTLLQK 
IXJARVDQMEGGAAGPSASVRnAVGQPPRETDFMAFHQEHEVR>lS 
HLEDIIX^IOjREKSEA I OKELE^KQKT I S AVGEQLLPTVEHLQL 
ELDQLK5ELSSWRHVKTGCETVDAVQERVDVQVREMVKLLFSED 
QQGGSI^QLLQRFSSQFVSKGDLQTMLRDLQLQIIiRNVTK^^ 
TKQLPTS EAWSAVSEAGASGI TEAQARAIWSALKLYSQDKTG 
MVDFALBSGGGSILSTRCSETYETKTALMSLFGIPLWYFSQSPR 
WIQPDI YPGNCWAFXGSQGYLVVRLSMMIHPAAFTLEHI PKTL 
S PTGNI S SAP KDFAVYGLENEYQEEGQLLGQFTYDQDG2SLQMF 
QALKRPDDTAFQI VELRI FSNWGHPE YT CLYRFRVHGEPVK 


6019


2 


1066 


TPRDREPPPQRPPSSRRASHLAQEITSAASLGDQTQILGSLTTA








P VITSA I R SMPG 2 SSQI LTNAQGQVI GTLPWWNS AS VAAPAPA 
QSLQVQAVTPQTJ.T.fmQGQVXATIiASSPLPPPVAVRK\PSTPES 
LLKSEVQPIKPTPTVPQPAVVIAS PAPAAKPSASAPI PITCSBT 
PTVSQLVS KPHTPSLDEDG INIiEE IRBFAKNFKIRRLSLGLTQT 
QVGQAIiTATEGP AYSQSAI CR FE KLDI T P KS AQKLKP VLEKWLN 
EAEI^NQEGQQNlWEFVGGEPSKXRKRRTSFTPQAIEALNAyFE 
KNPLPTGQEITEIAKELNYDRF^TVRVWFO 
QIP 


6020 


4953 


549


EAIQFEVS IGlWGNKFDTTCia>l4ASTTQYSRAVFDGNYY YYLPW 
Ain^KPVVTLTSYWED I SHRI^AVNTLIiAMAERL^TNI EALKSG I 
QGKIPAN01AE^^KLIDEVIEDTRYTLP^.TEGKA^rV^VLDTQI 
RKXRSRSLSQIHFAAVRMRSEATDVKSTLAEI BDWLDKLMQLTE 
EPQNSKPD 1 1 1 WMIRGEKRIiAYARI PAHQVLYSTSGENASGKYC 
GKTQTI FLKYPQEKHNGPKVPVELRVN I WLGLSAVEKXFNSFAE 
GTFTVFAEMYENQALMFGKWGTSGLVGRHKFSDVTGKI KLKREF 
FIiPPKGWEWEGEWIVDPERSLIjTEADAGHTEFTDEVYQNESRYP 
GGDWKPAEDTYTDANGI)KAASPSELTCPPGWEWEDDAWSYDINR 
AVDEKX5WEYGITIPPDHKPKSWVAAEKMYHTHRRF^RI*VRKRKKD 
LTQTASSTAGAMEELQDQEGWE YASLIGW K.FHW KQRS SDTFRRR 
RVnU^KMAPS ETHGAAAI FKI^GA1X3AOTTEDGDEKSI*EKQKHSA 
TTVFGANTP I VSCKFDRDYT YKLRCYVYQARNLLALDKDS FSDP 
YAHI CFLHRSKTTEI IHSTLNPTWDQTI I FDEVEIYGEPQTVLQ 
NPPKVIMELFDNDOVGia>EFLGRS I FSPWKLNSEMDITPKLLW 
HPVMNGBKACGDVLVTAEIjIIjRGKDGSNLP I LPPQRAPNLYMVP 
QGIRPWQLTAIEILAWGLRNMKNFQMAS I TSPS LWBCGGERV 
ESWIKNLKKTPNFPSSVIiFMKVFLPKEELYMPPLVIKVZDHRQ 
FGRKP WGQCT I ERljDRFRCDP YAGKEDIVPQLKASIJUSAP PCR 
D IVIEMEDTKPLLASKCLS smstalsxmas PATVHLTEKEEEIV 
rnWSKFYASSGEHBKCGQYIQKGYSKLKTYNCELENVABFEGLT 
DFSDTFICLYRGKSDENEDP S WGEFKGS FR I YPLPDDP SVPAPP 
RQFREiPOSVPQECTVRI Y IVRGLELQPQDNNGLCDFY I KITLG 
KKVI E \ DRDHY I PNTLNPVFGRMYEJbS CYLPQEKDLK2SVYD YD 
TFTRDEKVGBTI IDLENPF\LSRFG\SHCG\ IPEEYCVSGVNTW 
RDSLR\ PTQ \ IiLQNVARFKGFPQP ILSEDGSR I RYGGRD YS LDE 
FEANKILHQHIiGAPEERIiAIiHIIiRTQGIiVPEHVETRTtMSTFQP 
NI S \ R YYX»RV I I WNTKDVI LDEKS I TGEEMSD I YVKGWI PGNEE 
NKQKTDVHYRSLDGEX3NFNWRFVFPFDYLPAEQLC I VAKKEHFW 
S I DQTEFRI P PR\LI IQI W \ DNDKFS \ LDDYLGFPRTLTCRHTI 
HFLQ KS PGGNC / RGLDM I PDLKAMNPLKAKTAS L FEQ KSMKGWW 
PCYAEKDGARWIAGKVEMTI^IL^KEADER^ 
KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKWVI IGLLFLLI LL 
LFVAVLLYSLPNYLSMKIVKPNV 


6021 


4953 


549 


eaiqfevs ignygnkfiotcxpij\sttqysravfdgnyyyylpw 
a:^kpvvtltsywedistoldavim,iamaerii<^nif^lksg i 
qgkipanqlaelwlklidbvi edtrytlpltegkanvtvldtqi 
rklrsrs lsqiheaavrmrse atdvkstlae i edwldklmqlte 
epqnsmpdiiiwmirgexrlayaripahqvlystsgenasgkyc 
gktqti flkypqe knngpkvpvelrvni wlglsavekkfns fae 
gtftvfae^ekqalmfgkwgtsglvgrhkfsdvtgki klkref 
flppkgwewegewivdpersllteadaghteptdevyqnesryp 
ggbwkpaedtytdakgdkaaspseltcppgweweddawsydinr 
avdf^gv^ygitippdhkpksvataaekmyhthrrf^lvrkrkkd 
ltqtas staga^elqdqegwe yaslig wkfhwkqrs sdtf'rrr 

RWRRKMAPSETHGAAAI FKLBGALGADTTETCDEKSLEKQKHSA 
TTVFGANTP XVSCNFDRDYI YHLRCYVYQARNLLALDKDS FSDP 
YAHI CFLHRSKTTB I IHSTLNPTWDQTI I FDEVEIYGEPQTVLQ 
NPPKV1MELFDNDQVGKDEFW3RS I FSPVVKLNSEMD ITPKLLW 
HPVMNGDKACGDVLVTAEIjILRGKDGSNIiPIIjP PQRAPNLYMVP 
QG IRPWQliTAIE I LAWGLRNMICNFQMASITSPSLWECGGERV 

eswiknlkktpnfpssvlfmkvtlpkeelympplvikvidhrq 
fgrkpwgqctierldrfrcdpyagxedivpqlkasi>lsappcr 








DIVIEMBDTKPliIiASKCI^MSTAI»SKMASPATVHLTEKEEEIV 
DWWSKFYASSGEHEKCGQrlQKGYSKLKlYKCELENVAETEGLT 
DFSDTFKLYRGKSDENEDPSVVGE FKGS FRI YPLPDDPSVPAPP 
RQFRELPDSVPQECTVRI YTVRGLEU3PQDNNGLCDPYIKIT1X3 
KKVI E \DRDHYI PNTLNPVFGRMYEIjS CYLPQEKDUd SVYDYD 
TFTRDEKVGETI rDI*ENPF\LSRFG\SHCG\ I PEEYCVSGVNTW 
RDSjbR\PTQ\LLQNVARFKGFPQPII*SEDGSRIRYGGRDYSLDE 
F EANXXLHQHIjGAP eerialh ILRTQGLVPEHVETRTLHSTFQP 
KI S \ RYYLRVT I WKTKDVI LDEKS I TGEEMS D I YVKGW I PGNE E 
NKQKTDWYRSI^EGNFNWRFVFPFDYLPAEQLCIVAKKEHFW 
SIDQTEFRIPPR\LIIQIW\DNDKFS\IjDDYLGFPRTI*TCRHTI 
HFLQKSPGGNC/RGLDMIPDLKAMNPLKAKTASLFEQKSMKGWW 
PCYAEKDGARVMAGKVEMTLBILNEKEADERPAGKGRDEPNMNP 
KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKVrviIGLLFLLirJ^ 
LFVAVLL.YSLPNYLSMKIVKPKV 


6022 


4953 



549



EAIQ FEVS IGNYGNKFDTTCKPIiAS TTQYS RAVFDGN YYY YL>PW 
AHTKPWTX»TSYWEDI SHRI*DAVNTtdAMAERLQTNIEALKSGI 
QGKI PANQLAELWL.KLIDEVT EDTR YTLPLTEGKANVTVLDTQI 
RKLRSRSLSQ 1HEAAVRMRS EATDVKS TLAE I EDWLDKLMQLTE 
EPQNSMPD 1 1 I WMI RG EKRLAYAR I PAHQVL YS TSGENASGKYC 
GKTQTI FLKYPQEKNNGP KVPVELRVNI WLGLSAVEKKFNS FAE 
GTFIVFAEMYENQAIjMFGKWGTSGLVGRHKFS DVTGKI klkref 
FLPPKGWEWEGEWIVDPERSLLTEADAGHTKFTDEVYQNESRYP 
GGDVf KPAEDTYTDANGD KAASPSELTC p pg we weddaws ydinr 
AVDE KGWEYG I TIP PDHKPKS WVAAE KMYHTHRRRRLVRKRKXD 
LTQTAS STAGAMEELQDQEG WEYAS L I G WKFHWKQRSSDTFRRR 
RWRRKMAPSETHGAAAI FKLEGALGADTTEDGDE KS LEKQKHSA 
TTVFGANTP I VS CNFDRD Y I YHLRCYVYQARNLLALDKDS FS DP 
YAH I CFLHRSKTTE I IKS TLNPTWDQT 1 1 FDEVE I YG EPQTVIjQ 
NPPKVIMEIiFDNDQVGKDEFLGRSIFSPVVKLNSEMDITPKLIjW 
H PVMNGDKACGDVL VTAEI»I LRGKDGS NLPILP P QRAPNL YMVP 
QGIRPWQLTAI EI LAWGLRNMKNFQMAS ITSPSLWECGGERV 
ESWIKm»KKTPNFPSSVI,FMKVFiPKEELYMPPLVIKVIDHRQ 
KGR KPWGQCT I ERLDRFRCDP YAGKEDI VPQLKAS LLSAPPCR 
D I VI EM EDTKPKLASKCLS SMS TALSKMAS PATVHLT2KEEE I V 
DWWSKFYASSGEHEKCGQYIQKGYSKLKIYNCEIiENVAEFEGLT 
DFSDTFKLYRGKSDENEDPSVVGEFKGSFRIYPLPDDPSVPAPP 
RQFIIELPDSVPQECTVRIYIVRGIjELQPQDNNGLCDPYIKITLG 
KKVIE \ DRDHY I PNTLNPVFGRMYELS CYT.PQEKDLKI SVYD YD 
TFTRDEKVGETI IDI*ENPF \ I»SRFG\SHCG \ I PEEYCVSGVNTW 
RDSliR\PTQ\LIjQNVARFKGFPQP UjSEDGSRIRYGGRDYSI*DE 
FEANKILHQHLGAPEERIALHII^RTQGLVPEHVETRTLHSTFQP 
NIS\RYYLRVI1WNTKDVILDEKSITGEEMSDIYVKGWIPGNEE 
NKQICTDVHYJtSLDGEGNFNWRFVFPFDYl*PAEQLCIVAKKra 
S IDQTEFRIPPR\LI IQIW\DNDKFS\I*DDYLGFPRTIjTCRHTI 
HFLQKS PGGNC/RGI*DMI PDLKAMNPLKAKTAS LFEQKSMKGWW 
PCYAEKDGARVMAG KVEMTLE I LNEKEADERPAGKGRDE PNMN P 
KLDIiPNRPETS FLW FTN PCKTMKFIVWRR FKWVI IGLLFLLILL 
LFVAVLLYSLPNYLSMKIVKPNV 


6023 


102 


916 


S OELGMF VEI*NNLiNTTPDRAEQG KLTL LCDAKTEKSS FJjVHHFI* 
S FYLKANCKVCFVAIiI QSFSHYS I VGQKLGVSLTMARERGQLVF 
LEGL/ rVCSGR\VFQAQKEPHPLQFLREANAGNLKPLFE FVREA 
LKPVDSGEARWTYPVLLVDDLSVLI^U?MGAVAVIJDFIHYCRAT 
VCWELKGNMVVLVHDSGDAEDEENDILIiNGLSHQSHLILRAEGL 
ATGFCRDVHGQUR I LWRRPSQPAVHRDQS FTYQ YKIQDKS VS FF 
AKGMSPAVL 


6024 


3 


3260 


FI^FlX^PRFRCiFCXQFAIPASRMEQLNEI^LLMEKSFWEE^E 
LPAELFQKKVVASFPRTVLSTG^I^YIiVTjAW 
RIiVITASQSI*ENKELCIIiRNDWCSVPVEPGDI IHLEGDCTSDTW 
1 1 DKDFGYL 1 1* YPDMIiI SGTS IAS S I RCMRRA VLS ETFRS S DPA 
THQMklGTVliHEVFQKAINNSFAPEKL^ 








YRLNUSQDEIKQEVEDYLPSFCKWAGDFMHKNTSTDFPQMQIjSL. 
PSDNSKDNSTCNIEWXPMD I EES I WS PRFGLKGK1 DVTVGVKI 
HRGYKTKYKIMPLELKTGKESNS I EMRSQVVLYTLLSQERRADP 
EAGLLLYLKTGQMYPV^ANHIJDKRELLKXJ^ 
ATRQKTQIASLPQI IEEEFCTCKYCSQIGNCALYSRAVEQQMDCS 
S\fl?IVMLPKIEEETQHI#KC/nn^YFSLW 

HQNIWI^PASE>IEKSGSCIGNIjIRhIEHVXIVCDGQYLHNFQCKH 
GAI PVTNI^IAGDRVIVSGEE^SLFALSRGYVKEINMTTVTCI^LD 
RNliS VLPESTLFRLDQEEKNCDI DT P LGNLS KLMENT FVS KKLR 
DLI I DFREPQF I S YLS S VLPHDAKDT v AC I LKGLNKPQRQAM KK. 
VLLS KDYTIi I VGMPGTG KTTTI CTLVRI L YACGFS VLLTS YTHS 
AVDNI LLKLAXFKIGFLRSR\Q I QKVHPAIQQFTEHE ICRSKS I 
KS \LALLEELYTSQLI DATTCMGINHPI FSRKI FDFCI VDEASQ 
ISQP I CLGPLFFSRRPVLVGDHQQLP PLVLNREARAI/3MS ESLF 
KRLEONKSAWQLTVQYRMNSKIMSLSNKIjTYEGKI^CGSDKVA 
bCyVZNLRHFKDVKLEIiEFYADYSDNPWLMGVFEPNNPVCF^ 
KVPAPEQVEKGGVSNVTEAKIjI VFI*TS I FVKAGCS PS D I G I IAP 
YRQQLKIIl!roi*IiARSIGMVEVNTVDICYQD\RDKS IVLVSFVRSN 
KDGTVGEXjIiKDWRRT JtfVAI TRAKHKL I LLGCVPSLNCYPPLEKL 
LNHLNSEKL 1 1 DLPSREHES LCH I LGDFQRE 


6025 


3977 


89 


GGFPAQSDHLPPVFPLRSDLLITMSTLYVSPHPDAFPSLRALIA 
ARYGEAGEGPGWGGAHPRICIjQPPPTSRTSFPPPRLPAliEQGPG 

glwvwg atavaqllwpaglgg p g gs raavl»vqq WVS YADTE t*I P 

AACGAT LP ALGLR SS AQD P Q AVLGALGRALS PLEE WLRLHT YLA 

GEAPTLADIAAVTAI^I*PFRYVIiDPPARRIWNNVTRW 

PEFRAVLGEVVLYSGARPLSHQPGPEAPALPKTAAQLKKEAKKR 

EKLBKFQQKQKI QQQQPPPGEKKP KP E XREKRDPGVI TYDLPTP 
PGEKKDVSGPMPDSYSPRYVEAAWYPWWEQQGFFXPEYGRPNVS 
AAI^RGVF^^CIPPPNVTGSLHI/SHAIjTNAIQDSIjTRWHRMRGB 
TTLWNPGCDHAG IATQVVVEKKLWREQGLSRHQLGREAFLQEVW 
KWKE EKGDRI YHQLKKLGS S LDWDRACFTMDPKLS AAVTEAFVR 
IiHEEG I I YRSTRLVNWS CTLNSAISD I EVDKKELTGRTLLSVPG 
YKEKVE FGVLVS FAYKVQG S DS DEEVWATTR I ETMLGDVAVAV 
HPKDTRYQHLKGKNVIHPFLSRSLPIVFDEFVDMDFGTGAVKIT 
PAHDQND YEVGQR HGLEAI S I MDSRGAL I NVP PP FLGLPRFEAR 
KAVLVALKERGLFRGI FXINPMWPLCNRS KDWE PLLRPQWYVR 
CGEMAQAASAAVTRGDIJiILPERHQRTWHAWMDNIRE\WCMFPG 
KIWWG\HR\ IPAYFVTVSDPAVPPGEDPDGRYWVSGRNEAEARE 
KAAKEFGVS PDKI S LQQDEDVLDTW FS S GLFPLS I LG WPNQS ED 
I^VFYPGTLLETGHDILFFWVARMvMIjG KaVjiiOn. 
IVRI3AHGRKMSKSIjGNVIDPLDVIYGISIjQGLHNQLLNSNI»DPS 
EVEKAKEGQKADFPAGIPECGTDAI*RFGtiCAYMSQGRDINLDVN 
R I LGYFJf FCNKL WNATKFALRGLGKGF VPS PTSQPGGHESLVDR 
WIRSRLTEAVRI*SNQG FQAYDFPAVTTAQYSFWLYELCDVYLEC 
LKPVLNGVDQVAAECARQTLYTCIJ3VG^ 

RLPRRMPQAPPSLCVTPYPEPSECSWKDPEAEAALEIiAIiSITRA 
VRP \ LRADYNLHPESG PTCFLEVAD\EATGALASAVS G YVQG PG 
QAQVWAVAEPWGLPAP\C^CAVAIASDRCS I \HLQLQG\LLDP 
ARELG\KI*Q\ AKRVEAQ\RQAQ\ RIjR\ERRA\ ASGNP VKVPI»\E 
VQEADEAIOX2QTEAELRKVDEAIALFQKML 


6026




jit 


GPITFLKKKAKNnO^MPLRIHVLLGIJ^TTIjVQAVDKKVDCPRIiC 

TCEIRPWFTPRSIYMEASTVDCNDI^IJ^TFPARLPANTQILIiLQ 

TNNIAKZETSTDFPVNliTGI^LSQNNr>SSWNrNGKKMPQO 

YLEENKljTEl.PEKCIiSELSNIiQELYTNHNIiSTISPGAFIGLiHW 

I^JU^HXjNSNRI*CWI1JSKWFDALPNLEILMIGENPI 

PL INLRS LVXAG 1NLTE I PDNALVGLENLES ISFYDNRLI KVPH 

VAI^KVVNLKFTJDUmJPINRIRRGDFSNMLHLK^M 

I S IDSLAVDNLPDLRKI EATNNPRLS YIHPNAFFRLPKLESLML 

NSNALSALYHGTIESLPNLKEISIHSNPIRCDCVIRWMNMNKTN 

IRFMEPDSLFCVDpPEFQGONVRQVTIFRDMMEICLPLIAPESFP 

SlHiNVEAGSYVSFHCRATA\EPQPEIYWITPSGQKIiLPNT\LTD 








: KFY VHSEGTI^INGVTPKEGGL»YTCI ATNIiVGADLXSVMI KVDG 
SFPQDNNGSLNI KI RDIQANSVLVSWKASSKILKSSVKWTAFVK 
TENSHAAQSARIPSDVKVYNIiTHLNPSTEYKICIDI PTIYQKNR 
XKCVNVTTKGIiHPIXJKEYEICNNTTTIjMA^^ I IGVI CLI S 

CI^PEr^CDGGliSYVRNYIXJKPTFALOTLYPPLINLWEAGKEKS 
TSLKVKATVIGLPTNMS 


6027 


5254 


4148 


GGPJRAPGRPGRS I KDE EE ETV FREWS FSPDPLPVRYYDKDTTK 
PISFYLSSI£EI^WKPIU*BDGFNVAIiEPLACROPPLSSQRPST 
IJliCHDr^GGYIiDDRFIQGSWQTPYAFYHWQCIDVFVYFSHHTV 
TI PPVG WTNTAHRHGVCVLGTF I TEWNEGGRLCEAFLAGDERS Y 
QAVADRLVQIT\RFFRFDGWIiINlENSLSLAAVGNMPPFLRYLT 
TQLHRQ V PGGI»VLVrYDS VVQSGQLKWQDELNQHNRVFPDSCDGF 
FTNYNWREEHLEPJ«l^AGSRJiADVWGVDVPARGNWGGRroT 
DKVGGGFRPRASGPVPPDGPHFLMDLPFPSAPQRNDSSCSSQSG 
DPVALRNRCPAPAKIjCPH 


6028 


120 


3432 


NCLLlEjQAKGFHGE I EDLQQWLTDTERHLLASECPLGGL.PETAKEQ 
LNTVHMEVC^AFEAKEETYKSIxMQK^ 

NNIiKEKWES V BTICLNE R \ KT\ KLEEALNI*A\MEFHNS L\QDFIN 
WliTQAEQTLNVASR P S LILDTVLFQIDEHKVFANEVNSHREQ 1 1 
EIjDKTGTHIJCYFSQKQDVVL IKNLI*ISVQSRWEKWQRI>VERGR 
S1JDDARXPAXQFHEAWSKLMEWLEESEKSLDSELEIANDPDKZK 
TQLAQHKEFQKSIiGAKHSVYiyrTNRl 

DMLS2LRDKWDTI CX3KSVERQNKLEEA\LI»FSGQFTDALQALID 
WLYRVEPQLAEDQP VHGDI DLVMNL I DNHKAFQ KELG KRTSSV Q 
ALKR S ARE L I EGSRDDS SWVKVQMQELSTRWETVCALS I SKQTR 
LEAAI^QAEEFHSVVHAIiLEWIA.EAEC^RFHGVLPDDEDALRT 
LIDQHKEFMKKLEEKPAELNKATTMGIDTVLAI CHPDS ITTI KHW 
IT! I RARF E E VXAWAKQHQQRLAS ALAGL I AKQ ELLEALLAWLrQ 
KAETTLTDKDKEVI PQE I E EV KAL IAEHQTFME EMTRKQ P DVDK 
VTKTYKRRAADPS SLQSHI P VLD KG RAGRKR FP AS S LY PSG S QT 
QI ETKNPRVNUjVS KWQQV>njIAIiERRRKLNDAI»DRI*EEIjREFA 
NFDFDIWRKKY^WMNHKKSRVMDFFRRIDKDQDGKITRQEFID 
G I LS S KFPTSRIiEMSAVAD I FDRDGDG YID YYE FVAALHPNKDA 
YKPITDADKI EDEVTRQVAKCKCAKRFQVEQIGDNKYRFFLGNQ 
FGDSQQLRLVRILRSTVMVRVGGGWMALDEFLVKNDPCRAKGRT 
NKBI*REKFIIADGASQGMAAFRPRGRRSRPSSRGASPNRSTSVS 
SQAAQAASPQVPATTTPKI LHPLTRNYGKPWLTNS KMS TPCKAA 
ECEDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGEDSGI.ITTAA 
ARYRTQFADS KKTPS RPGSRAGS KAGSRASSRRGSDASDF0 1 SE 
IQSVCSDVETVPQTHRPTPRAGSRPSTAKPSKIPTPQRKSPASK 
IsDKSSKR 


6029 


1 


3533 


IMPCGSSRLI/RGCWTHPJJEPVSDiSYFDCIE^VME^SKVIiGESM 
AG I S QMAKTGDLPAFGE CVG I AS KALCGLTEAAAQAAYLVG I FD 
PNSQAGHQGLVDP IQFARANQAI QMACQNLVDPGS SPSQVLSAA 
TI VAKHTSALCNACRI AS S KTANPVAKRHFVQSAKEVANSTANL 
VKT I KALDGDFS EDNRNKCR IATAPL I EAVENLTAFASNP E FVS 
I PAQIS SEGSQAQEP ILVSAKPMU3SSS YLIRTARSLAINPKDP 
PTWSVIiAGHSHTVS DS I KSL I TS IRD KAPGQPJSCD YS IDGINRC 
IROIEQASLAAVSQSLiATRDD IS VEALQEQLTSWQEIGHLIDP 
I ATAARGEAAQLGHKGTQLAS YFEPL I LAAVGVAS K ILDHQQQM 
TVLDQTKTIAESALQML YAAKEGGGN PKAQHTHDAI TEAAQIiMK 
EAVDDIMVTLNEAASEVGIiVGGKVDAIAEAMSKLDEGTPPEPKG 
TFVDYQTTWKYSKAIAVTAQEMOTKSVmPEELGGLASQMTSD 
YGHLAFQGQMAAATAE PEE I GFQ I RTR VQDLGHG C J FLVQKAG\ 
ALQVCPTDS YTKREL1 ECARAVTEECVSLVLSALCJAGNKGTQACI 
TAATAVSGI I ADLDTT I MFATAGTLNAENSETFADHRENTLKTA 
KAL VEDT KLiL VSGAAS T PDKLAQAAQSS AAT I TQLAE WKLGAA 
SLGSDDPETQWItlNAIKDVAKALSDLISATKGAASKPVDDPSM 
YQLKGAAKVMVTNVTSLLKTVKAVEDEAT^ 
EIiTVFQSKDVPEKTS S PEES I RMTKGITMATAKAVAAGNSCRQB 
D VIATANLS RKAVSDMXiTACKQAS FHPDVSDEVRTRALRFGTEC 








TLGYLDI*LEHVI>VI LQKPTPELKQQLAAFS KRVAGAVTELIQAA 
EAMKCTEWVDPEDPTVIAETBI»IjGAAASIEAAAKKI>EQL>KPRAK 
PXQADETLDFEEQILEAAKS IAAATSALVKSASAAQRELVAQGK 
VGS I PANAADIX3QWSQGLrSAARMVAAATSSLCEAANASVCX3HA 
S EEKLI S S AKQ VAAS TAQLL VAC KV KADQ DSEAMRRLQAAGNAV 
KRASDNIiVRAAQKAAFGKADDDDWVKTKFVGGIAQI IAAQEEM 
LKKERELE EARKKLAQ I RQQQ YK FLPTELREDEG 


6030 


3 


1777 



FPGRGSPALQLEVLI O/GLMGLERALNVLAPI FYRNIVNLLTBN 
APWNSIJLWTVTSYVT^KFLQGGGTC 

TSRRVEIiLI FSHLHEI*SLRWHIiGRRTGEVLR I ADRGTSSVTGLL 
<; VT .VFNVT P*TT .RDT TTGTT YFS MFFNAW FGLI VFLCMSLYLTLT 
IVVTEWRTKFRRAMirrQENATRARAVI>SIiIjNFETVKYYNAESYE 
VKRYREAI I KYQGLEWKSSASLVlJ^OTQNLVIGLGLIiAGSLIiC 
AYFVTEQKLQVGDYVLFGTYT IQLYMPLNWFGTYYRMIQTNFID 
MENMFDLLKK\STEVKDLPGAG P FRFQKGRI E FENVHFS YADGR 
BTLQDVSFTVMPGO^TLA1jVGPSGAGKSTILRI»LFRFYDISSGCI 
RI DGQDI SQVTQALFRFSHWE3LCPKDTVLFNDTIADN I RYGRVT 
AGND E VEAAAQAAG IHDAI MAF PEG YRTQ VG ERGUCLS GGEKQR 
VAIARTIIiKAPGI ILLDEATS ALDTSNERAI QASLAKVCANRTT 
I WAHRLSTWNADQILVI KDGCI VERGRHEALLSRGGVYADMW 
QLQQGQEETS EDTKPQTMER 


6031 


160 



1694 


LRMS ENLDKS NVNEAG KS KSNBS EEGLEDAVEG ADEALQ KAI KS 

nf> n p n/^m TAD Ml C? P t>t)D m JfWUPT T tTTA'DmTT'WMIYT'.ZVWFT WTOfr 
UooorU KVyrvJrxlfeoJcrr'iCr V 1 V Hi Hi LjLjUj k .rVrCvj v ijnn/umnriJL v vi*V7 

DFQIKPVEIjPENSLKKRVKEIVHKAFWDCLSVQLSEDPPAYDHA 
I KLVGE I KETLLS FLL PGHTRLRNQ I TEVLDLDL I KQEAENGAL 
D I S KLAEFI IGMKGTLCAPARDEE VKKTiKD I KE rVPLFRE I FS V 

LDLMKVDMANFAISS irphlmqqsveyerkkfqei lerqpnsld 

FVTQWLEEASEDI^TQKYKHAIjPVGGMAAGSGDMPRLSPVAVQN 

yaylkllkwdhlqrpfpetvlmdqsrfhelqlq\reqltilgav 
llvtfsmaapgissqadfaeklkmivkiiatdhhlpsfhlkdvi* 
ttigekvclevssclslcgs s pfttdketvlkgqiqavas pddp 
irri mesriltfletyiias ghqkplptvpggls pvqreiibevai 
kfarlvnynkmvfcpyybail.skilvrs 


6032 


39 


2415 



AARLCRAQPTKSAWMIRDLSKMYPQTRHPAPHQPAQPFKFTISE 
SCDRIKEEFQFLQAQYHSLKLiECEKLASEKTFJvlQRHYVMYYEMS 
YGLNI EMHKQAEI VKRLNAI CAQVI P FL.SQEHQQQ WQAVERAK 
QVTMAEIiNAIIGQQQI>QAQHI*SHGHGI*PVPl,TPHPSGLQPPAIP 
PIGSSAGLLALSSAIX^QSHLPIKDEKKHHDNDHQRDRDS I KSS 
SVSPSAS FRGAEKHRNSADYSS ESKKQKTEEKEIAARYDSDGEK 
SDDNLWD VSNEDPS S PRGS PAHSPRENGLDKTRLIjKKDAP ISP 
AS IAS SS STPS S KS KELSLNE KSTT PVS KSNTPTPRTDAPTPGS 
WSTPGTjR PVPGKPPGVDPLASSLRTPMAVPCPYPTP FGI VPHAG 

MNGELTS pgaayaglhnispqmsaaaaaaaaaaaygrs pwgfd 

PHHHMRVPAI PPNI*TGI PGGKPAYSFHVSADGQMQPVP FPPDAI* 

IGPG I PRHARQINTXjraGEWCAVTISNPTPJJ\TYTGGKGCVKVW 

DISHPGITKSPVSQI^CLNRDNYIRSCRLLPDGRTLIVGGEASTL 

S IWDLAAPTPRIKAELTSSAPACYAIiAISPDSKVCFSCCSDGMI 

AVWDLHNQTLVRQFO^SHTDGASCIDISNDGTKLVn^ 

W \DLREGRQIjQQHD / FFTS PVFSLGYCP \TEEWIAVGMENSN\ V 

EVLHVTKPDKYQLHI^ESCVI^LKFAHOT 

W\RTPYG\ASIF\QSKESSS\VLSC33I\SVDDKYIVTGS\GDK\ 
RATVYEVIY 


6033 


39 


2415 


AARLCRAQPTKSAWM I RDLS KMYPQTRHP APHQPAQP FKFT I S E 
SCDRIKEEFQFLQAQYHSLKLECEKLASEKTEMQRHYVMYYEMS 
YGLNI EMHKQAEI VKRIiNAI CA.OVIPFT1SQEHQQQ WQAVERAK 
QVTMAELNAI IGOQQLQAQHLSHGHGLPVPLTPHPSGLOPPAI P 
P I GSSAGLIiALSSAIjGGQSHLP I KDEKKHHDNDHQRDRDS I KS S 
S VS PSAS FRGAEKHRNSADYS S ES KKQKTEEK3 IAARYDS DGB K 
SDDNLWDVSNEDPSS PRGSPAHS PRENGLDKTRLLKKDAPIS P 
AS IASSSSTPSSKSKELSLNEKSTTPVS KSNTPTPRTDAPTPGS 
NSTPGLJIPVPGKPPGVDPIjASSIxRTPMAVPCPYPTPFGIVPHAG 
MNGELTS PGAAYAGIjHNIS PQMSAAAAAAAAAAAYGRSP WG FD 
PHHI LMRVP AI PPNLTG I PGG KP A YS FHVS ADGQM Q P VP FP P DAli 
IGPGlPRHARQINTIiNHGEWCAVTISNPTRHVYTGGKGCVK^ 
DISHPGNKSPVSQLDO^numil^aU^LPDGP/rijIVGG^ASTL 
S I WDLAAPTPR I KAELTSSAPACYALAI S PDSKVCFSCCSDGNI 
A VWDLfttJQTLf VRQ FQGHTDGAS C I DI S >HX3 TKXi WTGG LDNTVRS 
W\ DI*REGRQLQQHD/ FFTS PVFSLGYCP \ teewlavgmensn \ V 
EVIjHVTKPDKYQI^IJIESCVIiSLKFAHC^ 

W\RTPYG\AS IF\QS K3SSS \ VLSCDI \SVDDKYI VTGS\GDK\ 
RATVYEVIY 


6034 


2683 


714 


ESGRRRR1>KRRRS PC PGTAGG pgetn PG PG acprg preeaaaam 
EIAPQEAPPVPGADGDIEEAPAEAGSPSPASPPADGRLKAAAKR 
VT F P SDED I VSGAVE P KD P WRHAQISIVTVDE VI GAYKQACQ KLNC 
RQI PKLLRQLQEFTDI/5HRLDCI»DLICGEK3JDYCTCEAI*EEVFKR 
LQFKVVDLEOTNLDEDGASALFDMIEYYESATHLNISFNKHIGT 
RGWQAAAHKMRKTSCLQYLXOARNTPIjIiDHSAPFVARALRIRSS 
IAVLHLENASLSGRPLf4LIATALKMNKNIiR 

DSAQLGNLLKFNCSLQI LDLRNNHVLDSGLAYI CEGLKEQRKGI* 
VTlAVLWNNQI,THTGMAFIOm,PHTQ^ 

RHLKNGLI SNRS VLRLGLASTKLTCEGAVAVAEF I AES PRLLRI> 
DLRENE I KTGGLMALS LALKVNHS LLRLDKDRE PKKEAVKS FIE 
TQKALLAE IQNGCKRNLVLARBREEKEQP PQLS ASMPETTATEP 
QPDDEPAAGVQNGAPSPAPSPDSDSDSDSDGBEEEEEEGERDET 
PSGAIDTRDTGSSEPQPPPEPPRSGPPIiPNGLKPEFAlALPPEP 
PPGPEVKGGSCGIiEHELSCSKNEKELEELIjLEASQESGQETI; 


6035 


19 


404 


SVTYLGI IIiHKNTGAIiPADPVQIilSQTPTPSTKQQIjLSFIiGMVG 
YFYIiWIPGFAILTKPLCKLTKENLADAIDPKSFSHSSFRSIJCTA 
LENASTLAJLPDSSQPF\SLHTAEVQGCVVEI LTQGLGPLPV 


6036 


1745 


356 


LPDVEKIX5RRRGRKMDSVEKGAATSVSNPRGRPSRGRPPKLQRN 
SRGGQGRGVEKPPHIiAAIillARGGSKGIPLKNIKHIiAGVPLIGW 
VLRAALDSGAFQSWVSTDHDEIENVAKQFGAQVHRRSSEVSKD 
SSTSLDAIIEFlJTYHNEVBIVGNIQATSPCIiHPTDIiQKVAEMIR 
EEG YDSVFS WRRHQFRWSE I QKGVREVTEPLN LNPAKRPRRQD 
WDGELYF^GSFYFAKRHLIEMGYLI^GKMAYYEMRAEHSVDIDV 
DIDWP I AEQRVX.R YG Y FGKEKLKE I KLLVCN I DG CLTNGH I YVS 
GDQKE I IS YDVKDAI G I S LLKKSG IEVRX I S ERACS KQTLS SLK 
LDOCMBVS VSDKIiAVVDEWRKEMGLCW KEVAYLGNEVSDEECLK 
RVGLSG APADACSTAQKAVG Y I CKCNGGRGA\ I RE FAEH IC\LL 
MEKGLINFMPKNRNLAVNIGEKK 


6037 


2936 


1919 


WTSWWMSSVLTI UjFSIiQGNKMLNYSAPSAGGYIiJjPRKPVGTPA 
GGGFPRRHSVTLPSS KFRQNQLLS SLKGEPAPALS SRDSRFRDR 
SFSEGGERLI*PTQKQPGGGQVNSSRYKT\ ELCRP FEENGACKYG 
DKCQFAHGIHELRSLTRHPKYKTELCRTFHTIGFC P YGPRCHFI 
HNAEERRALAGARDLSADRPRLQHS FS FAGFPSAAATAAATGLL 
DS PTS ITPPP I I*S ADDLLGS PTLPDGTNNPF\AFS SQELAS LFA 
PSMOLPGGGSPTTFLFRPMSESPHMFDSPPSPQDSIiSDQEGYI*S 
SSSSSRSGSDS PTLDNSRRLP I FSRLS I SDD 


6038 


1450 


426 


SSAIiQEFGTRNHTFGVPL»PHRRKQI I SCNICQLRFNSDSQAAAH 
YKGT KKAKKLKALEAMKNKQKS VTAKDSAKTTFTS I TTNT INTS 

NSSCPS TETEEB KAKRLIA YCSI*CKVAVNS ASQLEAHNSGTKHK 
TMLEARNGSGTIKAFPRAGVKGKGPVNKGNTGIjQNKTFHCEICD 
VHVNSETQLKQHISSRRHKDRAAGKPPKPKYSPYNKLQKTAHPL 
GVKLVFSKEPSKPIAPRILPNPIAAAAAAAAVAVSS PFSIiRTAP 
AATIJFQTSALPPALLRPAPGPIRTAHTPVLPAPY 


6039 


4073 


1000 


LDE YEARLTIAKLDDFEEDNEDDDENRVWQEEKAAKI TELI NKL 
NFUDEAEKDLATVNSNPFDDPDAAELNPFGDPDSEEPITETASP 
RKTEDSFYNNSYNPFKEVG^QYloNPFDEPEAFVTIKDSPPQST 
KKKN IRPVDMSKYIjYADSSKTEEEELDESNP FYEPKSTPPPNNL 
VNPVQEL»ETERRVKRKAPAPPVLS PKTGVLNENTVSAGKDLSTS 
PKPSPIPSPVLGRKPNASQ^KbVWCKEVTK}TC^ 
RNGI>SFCAILHHFRPDLIDYKSLNPQDI KENNKKAYDGFASIGI 
SRIxLEPSD^7VLLAIPDKIlWMTYLyQIIyu^SGQEI 4 WVQIEEN 
S S KS TTKVGNY^TDTNSSVDQEKPYAELSDIiKKEPELQQ P X5GA 
VDFLS QDDS VFVNDSGVGESRS EHQTPDDHLS PSTAS P YCRRTK 
S DTE P QKS QQS S GRTSGS DD PG I CSNTDS TQAQ VLLGKKRLLKA 
ET1»EI*SDI,YVSDKKKDMS PPF I CEETDEQKLO/TLDIGSNLEKEK 
LENSRSLE CRSDPES P I KKTS LS PTS KXG YS YSRDLDliAXKKHA 
SLRO/TESDPDADRTTIiiniADHSSKIVQHRIJjSRQEELKERARVIf 
LE QARRD AAL KAGNKHNTNTAAP F CNRQ LS DQQD EERRR Q LRER 
ARQI,IAEARSGGKMSELPSYGERAAEKLKERSKASGDENDNIEI 
DTNEEI PEGFWGGGDELTNLENDLDTPEQNSKLVDLKLKKLLE 
VQ PQVANS P S S AAQKAVTES S EQDMJCS GTE D LRTERLQ KTTERF 
RKP WFSKDS T VRKTQLQS FSQ Y I ENRPEMKRQRS IQEDTKKGN 
EE KAA I TETQRXPSEDEVLNKG FKDS \ SQ YWGELAALENEQKQ 
IDTRAALVEKRLRYLMDTGRNTEEEI^^ j 
RKNQLSLIiEKEE^LEilRYEIj£jNREXjRAMIiAI EDWQKTEAQKRRE 
QTiTJ.DEZjVALVNKRDALVX D J .DAQEKQAEEEDEHLKRTLSQNKG 
KMAKKEEKCVLQ 


6040 


475 


1052 


PTALMTAPS CAF P VQFRQPS VSGLSQ I TKSL Y I SNGVAANNKLM 
LSSKQIT^WItn/sVEVVNTLYEDIQYMQVPVADSPNSRIiCDFFD 
PIADHZHSVEMKO^R\TLLHC^^GVSRSAAI»CIjAYXiMKYHAMSIi 
LDAKTWTKSCRP I IRPNSGFWEQL IHYEFQIiFGKNTVHM VSS PV 
GMIPDIYEKEVKLMIPL 


6041 


2 


3886 


TEKDE KTAHNLENVL.I H PWERLSE I CVAKI SE PEAD VES VLGVS 
NLTiQVLQKPKGSLKSSKKKNGKVRFADE ILESNKENEKCVS 5 EG 
EOECWEiTTEPSLTHNSSGLLSPIJliaCPIiEDLVCKLADISINY 
VNERKSEQHI^FLSTLI^SFSSSRVFK^LGDEKQSIVQAKPLE 
IAKLVQKNPAVQFLYQKLIGWLNEDQRKDFGFLVDILYSALRCC 
DNDMERKKVLDDDTKVDL KWNSLLK 1 1 EKACPS S D KHALVTPWL» 
KGDIIX5EKI*VNLADCLCNEDLESRVSSESHFSERWT1JL,SLVLSQ 
HVKNDYIiIGDVYVERIIVRLHETLFKTKKLSEAESSDSSVSFIC 
DVAYNYFS SAKGCLLMPSSEDLLLTXFQLCAQS KEKTHL»PDFL I 
CKL KNTWIjSG VNX.L VHQTDS S Y KEST F LHLS ALV3 LKNQ VQAS S L» 
D INSLQVIjLSAVDDLLNTLLES E DS YLMGVYIGSVM pndse WEK 
MRQSLPMQHfLHRPLLEGRLSLNYECFKTDFKEQDI KTLPSHLCT 
SALI^KKVl.IAI^KETVLENNELEKIIAELLYSLQWCEELDNPP 
IFLIGFCEII^KmiTYDNLRVLGNMSGLLQLLFNRSREHGTLW 
SLIIAKLILSRSISSDEVKPHYKRKESFFPI/TEGNLHTIQSLCP 
FLSKEEKKEFSAQCI PALLGWTKKDLCSTNGGFGHLAIFNSCLQ 
TKS IDDGEIiLHGILKI I ISWKKEHEDIFLFS CNLSEAS PEVLGV 
NI E 1 1 RFLSLFLKYCSS PIAESEWDFIMCSMIAWLETTSENQAL 
YSI PLVQLFACVSCDIACDLSAFFDSTTLDTIGNLPVNliISEWK 
EFFSQGIHSI*I*LPILVTVTGENKDVSETS FQNAMLKPMCETliTY 
I SKEQLLSHKLPARIiVADQKTNIjPEYIjQTIil^TLAPLLLFRARP 
VQIAVYHML YKLMPELPQYDQDNLKS YGDEEEEPALS PPAALMS 
ULSIQEDLLENVLGCI P VG0I VTI KPLSEDFCYVLGYLLTWKLI 
LTFFKAASS QLRALYSMYliRKTKSLNKJ^YHLFRLMPENPT YAE 
TAVEVPNKD P KTFFTEELQLS IRETTMLPYHIPHIiACSVYHMTL 
KDLPAMVRL WWNS S EKRVFN I VDRFTS KYVS S VLS FQEISSVQT 
STQLFNGMTVKARATTREVMATYTIEDIVIEI*I IQI>PSNYPLGS 
I IVESG KR VG VAVQQWRNWMLQI»S TYLTHQNG S IMEGXxALWKNN 
VDKR FEG VEDCMI C FSVIHGFNYSLP KKACRTCKKKFHS A\ CLY 
KWFTSSNKSTCSLCRETFF 


6042 


1306 


253 


MAEIAPAS PSD I KAS VSNGDTTLLCSRRQSCGMNE VRQ VS LTYP 
GSPAPSHSLPLQPRSGGSLCPSRAW/PDPHQLFDDTSSAQSRGY 
GAQRAPGGI*SYPAASPTPHAAFLADPVSNMAMAYGSSLAAQGKE 
LVDKNIDRFIP ITKLKYYFAVDTMYVGRKLGLLFFP YLHQDWEV 
QYCQDTPVAPRFDVNAPDLYI PAMAFITYVLVAGIALGTQDRFS 
PDLLGLQASSAIAWIiTLEVIAI IxLSLYLVTVNTDLTTIDLVAFL 
GYKYVGM I GGVLKGLLFGKIG Y YLVLGWCCVAI FVFMIRTLRIjK 
IlJU>AAAEGVPVRGARNQLRmijTMAVAAAQPMLMYWI J TFHLVR 


6043 


403 


599 


LCLFFFFPCATPVLPL»PSIjISAI./CLSH1>SVSSWFCPCQPPLPC 
PLP PLQN KTAKGSLSTEQS ERG 


6044 


793 


412 


KiEMWNFTLISKVKISREVTMIASKFGIGQQVRHSLLGYLGVVV 
DIDP VY S I^E PSPDELAVNDELRAAPWYHVVMEDDNGLPVHTYIj 
AEAQLSS BLQDEHP \EQPSMDELAQTI RKQLQAPRIiRN 


6045 


155 


2299 


SPLPQVAAMNYIiRRRI*SDSNFMANLPNGYMTDLQRPQPPPPPPG 
AHSPGATPGPGTATAERSSGVAPAASPAAPSPGSS GGGG FFSSL 
SNAVKQTTAAAAATFSEQVGGGSGGAGRGGAASRVLLVI DEPKT 
DWAKyFKGKKIHGEIDIKVEQABFSDLNLVAHANGGFSVDMEVL 
RNGVKWRS LKPD FVLi I RQHAFS MARNG D YRS L VI G LQ Y AG I PS 
WSI^SVYNFCDKPWVFAQMVRl^TKKLGTEEFPIilDQT FYPNHK 
EMLSS\TTYPVVVKMGHGTLWGWGKVKV1>NQHDFQDIASVVAI*T 
KTYATAE P F I DAKYDVR VQ KI GQNYKA YMRT S VSGNWKTNTG S A 
MIiEQIAMS DRYKLWVDTCSE I FGGLDI CAVEALJiGKDGRDH 1 1 E 
VVGSSMPLIGDHQDBDKQLrVELVVNKMAQALPRQRQRBASPGR 
GSHGQTPSPGALPLGRQTSQQPAGPPAQQRPPPQGGPPQPGPGP 
QRO<SPPr^RPPPCGQQH]^SGIX3PPAGSPLP0RX,PSPTSAPQQP 
AS QAAP PTQG QGRQSR P VAGG PGAP PAARP PAS PS PQRQAGPPQ 
ATRQTSVSGPAPPKASGAPPGGQQRQGPPQKPPGPAGPTRQASQ 
AGPVP3TG PP TTQQPRPSGPGPAGRPKPQLAQKPSQDVP PPAXA 
AAGGPPHPQLNKSQSLTNAFNIiPEPAPPRPSLSQDEVKABTIRS 
LRKSFASLFSD 


6046 


212 


1075 


EGI*TG PCERV PFLiIiG RG P PHGATRAGHRRAVRWAG PES LPP LPR 
SL 1MDS PRAGTHQGPLDAETEVGADRCTSTAYQEQRPQVEQVGK 
QAPI*S PGLPAMGGPGPGP CEDPAGAGGAGAGGSEPIiVTVTVQCA 
FTVALRARRGADI^SLRALIjGQALPHQXAQIXSQLSYIAPGEDGH 
WVPIPE2ESLQRAWQDAAACPRGLQLQCRGAGGRPVLYQWAQH 
SYSAQG PEDLGET^QGDTVDVLiCEVDQAWL>EGHCDGRIG I FPKCF 
WPAGPRMSGAPGRIiPRSQQGDQP 


6047 



49 


1405 



PVIiVTSIiRMREADTLRPPQLMEVSADI I STVE FNHTGELLATGD 
KGGRWIFQREPESKNAPHSQGEYDVYSTFQSHEPEFDYLKSLE 
IEEKINKIKWLPQQNAAHSLLSTNDKTIKLWKITERDKRPEGYN 
LKDEEGKLKDLSTVTSLQVPVLKPMDLMVEVS PRR I FANGHTYH 
INS I S VNS DCETYMSADDLRINIjWHIiAI TDRS FTP \NI VDI KPA 
NMEDLTEVITASEFHPHHCNLFVYSSSKGSLRLCDMRAAALCDK 
HSKLFEEPEDPSNRSFFSE I IS\SVSDVFG?SHSDRYMLTR\DY1> 

TVKVWDL\NMEARPIETYQVHDYTiRS KLCSLYENDCI FDKFECA 
WNGSDSVIMTGA\YNNFFPJ1FDRNTKOTVTIi\EASRESSKPRAV 
LKPRRVCVGGKRRRDDI S VDSLDFTKKI LHTAWHP AEN I IAIAA 
TNNLYI FQDKVNSDMH 


6048 


1 


3194 


GIRTPKFCDSPTSDLEMRNGRGRGKRMRPNSNTPVNETATASDS 

KGTSNSS KTRAGANS KGRRGS QNSS EHRP PASSTS EDVKAS PSS 

ANKRKNKPLSDMELNSSSEDSKGSKRVRTNSMGSATGPLPGTKV 

EPTVLDRNCPSPVLIDCPH PNCNKKYKH INGLKYHQAHAHTDDD 

SKPEAIX5DSEYGEEPILHADLGSCNG\ASVSQK\GSLSPARSAT 

PKWLVEPHSPSPSSKFSTKGLC^OCKLSGEGDTDLGALSNDGSD 

IX?PSVMDETSNDAFDSLERKCMEKEKCKKPSSIjKPEKIPSKSIjK 

SARPI /APLAl PPQQ I YTFQTATFTAAS pgsssgltatvaqamp 

NSPQLKPIQPKPTVMGE^FTVNPALTPAXDKKKiQOKXKKESS 

LESPLTPGKVCRAEEGKSPFRESSGNGMKMEGIiIjNGSSDPHQSR 

LASIKAEADKIYSFTDNAPSPSIGGSSRliENTTPTQPLTPLHVV 

TO^AEASSVKTWSPAYSDISDAGEIKSEGKVDSVKSKDAEQLVK 

EGAKKTIjFPPQPCjS KDS P YYQGFES YYS PS YAQSS PGALNPSSQ 

AGVESQAIJCTKRDEEPES1EGKVKNDICEEKKPEI*SSSSQQPSV 

ICKJRPKMYMQSLYYNQYAYVPPYGYSDQSYHTHIiIiSTNTAYRQQ 

YEEQQKRQSLEQC^RGVDKKAEMGLKEREAAIjKEEWKQKPSIPP 

TLTKAP S LTDLVKSG PG KAKE PGADPAKS V 1 1 PKLDDSSKLiPGQ 

AP EG L«KVKLSDASHLS KEAS EAKTGAE CGRQ AEMD P I LWYRQEA 

EPRKWTYVYPAKYSDIKSEDERWKEERDRKLKEERSRSKE>SVPK 
EDGKESTSSDCKLPTSEBSRI^SKEPRPSVHVPVSSPLTQHQSY 
I PYMHG YS YSQSYDPNHPSyRSMPAV>3MQNYPGS YLPSSYSFS P 
YGSKVSGGEDADKARASPSVTCKSSSESXALDILQQHASHYKSK 
2»f i A^l/^lM^bKJJKtjtoCXjVVGGGGSCSS VGGASGGERSVDRPRT 
S PSQRLM S THHHHHHIjG Y S LL P AQ YN L P Y AAGLS S TAI VAS QQG 
STPSLYPPPRR 


6049 


215 


1089 


AMTGVFDRRVPSIRSGDFQAPFQTSAAMHHPSQESPTLiPESSAT 
DSDYYS PTGGAPHG Y CS PTSAS YG\ KAI*NP YQYQYHG VNGSAGS 
Y PAKAYAD YS YAS S YHQYGGAYNR VPSATNQPEKEVTE P EVRMV 
NG KP KKVR KP RT I YS S FQLiAALQRRFQ KTQ Y I^LP ERAELAASL 
GLTQTQVKI WFQNKRS Kl KKIMKNGEM PPEHS PSSSDPMACNSP 
QS PAVWE PQGS S RSL S HHPHAHPPTSNQS PASS YLENS AS WYTS 
AASSINSHJbPPPGSX»QHPLiAXASGTLY 


6050 


566 


1718 


KGLBRTCCAMEESDSEKTTEKENLGPRMDPPLGEPGXGSIiGWVI* 
PNTAMKKKVLLMGXSGSGKTSMRS 1 1 F ANY IARDTRRLGAT I LD 
RIHSI^INSSLSTYSLVDSVGTmCTFDVEHSHVRFLGNLVLNLW 
DCXjGODTFMENYFTSQRDNI FKNVEVL I Y VFDVESRELEKDMH Y 
YQSCLEAILQNSPDAKI FCLVH KMDLVQEDQRDLI FKEREEDLR 
RLSRPLECS CFRTS I WDETLYKAWS3IVYQLI PNVQQLEMNLRN 
FAEI I EADEVLLFERATFLVISHYQCKEQRDAHRFEKI SNI I KQ 
FKI*S CSKLAASFQSME VRNSNFAAFI DI FTSNTYVMWMSDPSI 
PSAATLINI RNARKHFE KLERVDG P KQCLLMR 


6051 


566 


1718 


KGLERTCC^4EESDSEKTTEKEN1jG?^DPP]jGEPG\GSIjGWVI, 
PNTAMKKKVLLMGKSGSGKTSMRSIIFANYIARDTRRlrGATILD 
R I HS LQ INS S LSTYS LVDS VGNTKTFDVEHSH^/RFLGNLVLNLW 
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY 
YQS CLEAI LQNS PDAKI FCLVHKMDLVQEDQRDLI FKEREEDLR 
RLS R PLECS C FRTS IWDETLYXAWSS I VYQL I PNVQQLEMNLRN 
FAEI IEADBVLLFERATFLVISHYQCKEQRDAHRFEKISNI IRQ 
FKLSCSK1AASFQSMEVRNSNFAAFIDIFTSMTYVMVVMSDPSI 
PSAATLINIRNARXHFEKLERVDGPKQCLIiMR 


6052 


566 


1718 


KGLERTCCAI^ESDSEKTTEKENLGPRMDPPIjGEPG\GSLGWVL 
PNTAMKKKVLIiMGKSGSGKTSMRS 1 1 FANY IARDTRRLGAT ILD 
RIHSLQINSSLSTYSLVDSVG^KTFDVEHSRVRFLGNLVLNLW 
DCGGQDTFMENYFTSQRDN I FRNVEVLI YVFDVE S RELEKDMHY 
YQSCLEAILQNS PDAKI FCLVHKMDLVQEDQRDLI FKEREEDLR 
RLSRPLECSCFRTS IWDETLYKAWSS I VYQLI PNVQQLEMNLRN 
FAEI IEADEVLL FERATFLVT SHYQCKEQRDAHRFEKI SNI IKQ 
FKLS CSKLAAS FQS MEVRNSNFAAF I D I FTSNTYVMWMSDPSI 
PSAATLINIRNARKHFEKLERVDGPKQCLLMR 


6053 


201 



1704 


KGTEMNKSRWQSRRRHGRRSHQQNP WFRLRDSEDRSDS RAAQPA 
HDSGKGDDES PS TS SGTAGTS S VPELPG FYFD PEKKR YFRLLPG 
HNNCKPLTKESIRQKEMESKRLRLLQEEDRRKKIARMGFNASSM 
LRKSQLGFLNVTNYCHLAHELRI^CMERKKVQIRS^ 
RFOT*ILADTNSDRLFTVNDVTVGGSKYGI INLQSLKTPTLKVFM 
HEtnjYKlTJRKV\NSVCWA5LNHLDSHILLCLMGXAETPGCATLL 
PASLFVNSHPAG I DRPG\MLCS FR I PGXWS CAWSLN IQANNCFS 
TGLSRRVLLTNVVTGHRQS FGTNSDVLAQQ FALMAPLLFNGCRS 
GEIFAIDLRCGNQGKGWKATRLFHDSAVTSVRILQDEQYLMASD 
MAGKIIO^WDLRTTKCnmQYEGHVNEYAYLPLHVHEEEGILVAVG 

APGLLMAVGQDLYCYSYS 


6054 


1 


1054 


P ? IARLQE FGTSRRHMAAPSG VH LLVRRGSHRI FS S PLNHI YLH 
KQSS SQQRRN FFFRRQ RDI SHS I VLPAAVS SAHPVPKHI KKPDY 
VTTGIVPDWGDSIEVKIJEDQIQGLHQACQLARHVIJjIAGK^ 
DMTTEE IDALVHRE 1 1 SHNAYPS PLGYGGFPKSVCTSVNNVLCH 
GIPDSRPLQDGDI INIDVTVYYNGYHGDTSETFLVGNVDECGKK 
LVEVARRCRDEAIAACRAGAPFSVI GNTI SHITHQNGFQVCPHF 
VGHGIGS YFHGHPE I WHHANDS DLPMEEGMAFT IEP I ITEGS PE 
FKVLEDAWTWSLD/TS KVS AQFEHTVL I TS RGAQI LTKLPHEA 


6055 


421 



2364 


PPYFLLS FLAWWLYGQSDRTETDI SQSAGPPPGTLQCSALHHDP 
GCANCSRFCRDCSPPACQOmiVFPGNAI^GVQPPBI^RTIALI 
SSREP PRKKKKSQTETG KERBRTS FLTQGGKRFELQHGLAG I CM 
TLIiITGDS I VSAEAWDHVTMANRELAFKAGD VI KVXiDASNKDW 
WWGOIDDEEfiW FPASFVRL.WVNHKDEVEEGPSDVONGH1J5PNSD 
CLCLGRPIXJNRDQMRANVINE I MSTERHY I KHLKD I CEG YUCQC 
RKRRDMFS DEQLKVI FGN IEDIYR FQMG FVRDLEKQYNNDD PHL 
SEIGPCFLEHQDGFWIYSBYCNNHLDACMELSKIiMKDSRYQHFF 
EACRLLQQM I D I A\ I PGFLLTPVQKI CKY PLQLAEI »Ti KYTAQDH 
SD YR WAAAIiAVMRNVTQQINERKRRLEN IDK LAQWQAS VLDWE 
GBDILDRSSEIjIYTGEMAWIYQPXYGRJNQQRVFFIjFDHQMVI^ZK 
KDLIRRDILYYKGRID^KYEVVDIKIX5RDDDFNVSMKNAFKLH 
NKETEEIHLFFAKKLEEKIRWLRAFREERKMVQEDEKIGFEISE 
NQKRQAAMTVRKV PKQ KGVNSARS VP PS YPP PQDPLNHGQ. YLVP 
\DG I AQ3QVFE FTEPKRSQS PFWQNFSRLTPFKK 


6056 


43 


3358 


SGGRGPVRVRSEQI*SPSAEQVSQISQISUjRRPI*SSLPPPPSRA 
IAPTRAPIXEAIiTIMEVAEVESPIjNPSCKIMTFRPSMEEFREFNK 

ylaymeskgahraglakvtppkewkprqcyddidnlli papiqq 

MVTGQSGLFTQ YNI QKKAMTVKEFRQLANSGKYCTPRYIJJYEDL 
ERKYWKNI/TFVAP I YGADINGSI YDEGVDEWNIARJLNTVLDVVE 
EECGIS I EGVN TP Y L Y FG WW KTT F AWHTEDMDL Y S INYLHFGEP 
KSWYAIPPEHGKRIiEPJLAQGFFPSSSQGCDAFLRHKMTLISPSV 
LKKYGI PFDKITQEAGE FMI TFP YGYHAGFNHGFNCAESTNFAT 
VRWIDYGKVAKLCTC3lKDMVKISMDIFVRKFQPDRYQIiWKQGKD 
I YT I DHTKPTPASTPEVKAVn^RRRKVRKASRS FQCARSTS KRP 
KADEEEEVSDEVDGAEVPNPDSVTDDLKVSEKSEAAVKLRNTEA 
SSEEESSASRMQVEQN1»SDHIKLSGNSCI*STSVTEDIKTEDDKA 
YAYRSVPSISSEADDSIPLSTGYEKPEXSDPSELSWPKSPESCS 
SVAESNGVIiTEGEESDVESHGNGLEPGEI PAVPSGERNS FKVPS 
IAEGENKTSKSWRH PLSRPPARSPMTLVKQQAPSDEELPBVLS 1 

CTIjLMPYHKPDSSNEENDARWEITCLDEVVTSEGKTKPLIPEMCF 
I YSEENI EYSPPNAFI»EEDGTS LIjI SCAKCCVRVBAS cygi psh 
E I CDG WLCARC KRNAWT AE C CXi c^^rgg alkqtknnkw ahvmca 
VAVPBVRFTNVPERTQIDVGRI PLQRLKL KCI FCRHRVKRVSGA 
CIQCS YGRCPAS FHVTCAHAAGVTj\MEPDDWPYWNI TCFRHKV 
NPNVKSKACEKVI SVGQTVITKHRNTRYYSCRVMAVTSQTFYEV 
M FTIDR *5 F^RDTFPPD I VSRDCLKLG PPAEGEWOVKW PDGKIiYG 
AKY FGSN I AHM YQ VE FEDGSQIAMKRED I YTUDEELPKRVKARF 
VSAGRCHLGTCQVNS LS S PHVS QAQQETYLGFW 1NSKKS QCNI F 
LSGTY 


6057 


1 


853 


FVARliKEQEGEGGLG PRKEKGRARGRERRRRMQI/TR CC FVFliVQ 
GSLYIjVICGQDDGPPGSEDPERDDHEGQPRPRVPRKRGHISPKS 
RPNANSTLLGLI^PGEAWGIIX5QPPNRPNHSPPPSAKVKKIFG 
WGDFYSNI KTVALNLLVTGKI VDHGNGTFSVHFQHNATGQGNIS 
T <; L.VPPS KAVFFHOEOO IFI F AKASKI FNC V RMEWE KVE \RGRR 
TSLFTHDPAKTCSREHAQSSATWSCS^PFKWCVYIAFYSTDYR 
LVQKVCPDYNYHSDTPYYPSG 


6058 


1 


986 



HPLPSASIXSLPSVSI/SVSbCVR^ALLEAWP^PKRRRARVGS P 
SGDAASSTPPSTRFPGVAI YI.VE PRMGRS RRAFLTGIiARS KGFR 
VLDACSS EATHVVMEETS AEEAVS WQERRMAAAP PGCT PPALLD 
I S WLTES I/3AGQPVP VECRHRLEVAGPS KGPLS PAWMPAYACQR 
P^PLTHHNTGLSEALEILAEAAGFEGSEGRLLTFCRAASVIJCAX 
PS P VTTLS QLQGLPHFGEHSSR WQELLEHGVCEEVERVRRSB / j 
RLFTQI FGVGVKTADRWYREGLRTIiDDIiREQPQKLTQQQKAGEP 
SREAGPWASLNCTLDPSASTP 


6059 


2 


3650 


QQDFSSLADLTDHRAHRC PGDGDDDPQI>S WVAS SPSS KDVAS PT 
QMIGDGCDiGIiGEEEGGTGLP YPCQFCDKS FIRLSYUCRHEQIH 
SDKL? FKCTY CS RLF KHKRS RDRHI K1*HTGDKKYHCHE CEAAFS 
RSDHI^HLKTHSSSKPFKCIVCKRGFSSTSSLOSHMQAHKKNK 
EHLAKS EKEAKKDDFMCDYCEDTFSGTEELEKHVIiTRHPQLSEK 
ADIjQCIHCPEVFVDENTIjLAH I HQAHANQKHXCPMCPE \QFSSV 
\EGVYCHLDSHRQPDSSNHSVSPDPVLGSVASMSSATPDSSASV 
ERGSTTOSTLKPI^GQKKMRDDGQGWTKVVYSCPYCSKRDFNSI* 
AVLEIHLKTIHADKPQQSHTCQICLDSMPTLYNI^EHVRKX«HKN 
HAYPVMQFGNISAJFliCNYCPEMFADINSIjQEHIRVSHCGPNANP 
SIX5RNAFFQTQCSMGFI>TESSIiTEHIQ\Q\AHCSVGSAKLESPV 
VQPTQSFMEVYSCPYCTNSPIFGS ILKLTKH I KENHKNIPLAHS 
KKS KAEQS P VS SDVEVS S PKRQRI>S ASANS I SNGEYPCNQCDLK 
FSNFES FQTHXKLHLELLIJ^QACPOCKEDFBSQESIiIiQH LTVH 
Y>rrrSTHYVCF^CDKQFSSVDD\l^KH\LLDMPHPLCCTHCT\L 
CQEVFDS\ KVS I \QVHIAVKHSNEKKMYRCTACNWDFRKEADLtQ 
VHVKHSHIiGNPAKAHKC I FOGETFSTEVELQCHITTHS KKYNCK 
pro KAPHA T TTiTiK KHTiT? K KHCVFDAATENGTANG VPPMAT KKAE 
PADLQGMLLKNPEAPNSHEASEDDVDASEPMYGCBICGAAYTME 
VLLQNHRIiRDHNrRPGEDDGSRKKAEFTKGSHKCNVCSRTFFSE 
NGLREHLQTHRG PAKHYMC P I CGERFPSliIfTIiTEHKVTHS KSUD 
TGTCRI CKMPI*QS EEE F I EHCCMHPDLRNS LTG FRCW CMQTVT 
STI^LKIHGTFR^QKLAGSSAASSPNGQGLQKLYKCALCLKEFR 
SKQDLVKIJ)VNGLPYGIjCAGCMAJRSANGQVGGIiAPPEPADRPC3V 
GLRCPECSVKFES AEDIiESHMQVDHRDLTPETSG PRKGTQTS PV 
PRKKTYQCI KCQMTFENE RE I QIHVANHMI EEG INHE CKL.CNQM 
FDS PAKIiIiCHL I EHS FEGMGGTFKCP VCFTV FVQ ANKLQQH I FA 
VHGQEDKIYDCSQC^KFFFQTELQNHTMSQHAQ 


6060 


2145 


202 



SYE I VGKNKI>E VNHSQI*KAIjCKCSIiPSRXiPIX»ENl»PLtLDRG FR 
KEPRSRGSRERDNMIiHLHHSCLCFRSWLPAMIAVIiLSIAPSASS 

DI SAS R PNIIil«LMADDLfG I GDIGCYGNN TMR TPN I DRLAEDG VK 
LTQKISAASLCTPSRAAFLTGRYPVRSGiMVSSIGYRVLQWTGAS 

HHGFDHFYGMPFSLMGDCARWEIjSEKR\mi^QKI.NFLFQVLAI>V 

ALTLVAG KLTHXi IPVSWMPVI WSAI*SAVL.IiLASS YFVGAI»I VHA 
DCFI^RNHTITEQPMCFQRTTPLII^EVASFLKRNKHGPFLiFV 

SFIJJVHIPLITMF^FT/SKSLHGLYGDNVKEMD 

EGLSNSTLI YFTS DHGGSLENQI/3NTQYGGWNG I YKGGKGMGGW 

EGGIRVPGIFRWPGVLPAGRVTGEPTSLMDVFPTWRLAGSEVP 

QDRVIDGQDLLPLLLGTAQHSDHEFLMHYCERFLHAARWHQRDR 

GTMWKVHFVTPVFQ?EGAGACYGRKVCPCFGEKVVHHDPFI,I.FD 

LSRD PSE THIIiTPASEPVFYQVMER \ VQQAV WEHQRTL»SPVPLQ 

LDRLGNI WRPWLQPCCGPFPUCWCLREDDPQ 


6061 


110 


1330 


" ^lNIHMKRKTIKNINTFEWRMI,MITCMPAVRVKTEt^ESEQGSPN 
VHNYPDMEAVPLLI^NlTVnCG EP PEDSIjS VDHFQTQTE PVDLS INK 
ARTSPTAVSSSPVSMTASASSPSSTSTSSSSSSRIiASSPTVITS 
VSSASSSSTVLTPGPLVASASGVGGQQFIiHIIHPVPPSSPMNLQ 
SNKLSHVHRIPVWQS VPWYTAVRS PGNVNNTI WPIiLEDGRG 

hgkaqmdprgbsprqsksdsddddlpnvtldsvnetgstai^sia 

ravqevhps pvsrvrgnrmnnqkfpcs i s p fsi es trrqrtvln 
ppdsrktaystdcdf\egiiqqklytkssspgr\7hrrthtgekpy 

kctwegctwkfarsdeltp^yrkhtgvkp^ 

lalhrrrhmlv 


6062 


71 


1079 


etmakngpencedch i lnaeafkskkickslkicglvfgilabt 
livlfvjgskhfwpevpkkaydmehtfysngekkkiymeidpvtr 
teifrsgngtdetiievhdfkngytgiyfvglqkcfiktqx kvi p 
efsepeeeideneeitttffbqsvi wvpaexp ienrdflkns ki 
lei(^nvtmyw\inptl\isgtfajcqlhhnfafiii>vselqdfe 
eegedlhfpanekkg i e qneqwwpqvkve ktrharq as e help 
xndyteng i e fdpmiiderg ycc i ycrrgnr ycrrvce pllgyyp 
ypycyqggrvicrvimpcnwwvarmlgrv 


6063 


71 


1079 


"ETMAKNGPENCEDCHIIjNAJEAFKSKKICKSIjKICGLVFGILALiT 
LIVLFWGSKHEWPEVPKKAYDMEHTFYSNGEKKKIYMEIDPVTR 
TEI FRSGNGTDETLE VHDFKNG YTG I YFVGLQKCF I KTQ I KVT P 
EFSEPEFJEIDENEEITTTFFEQSVIWVPAEKPIENRDFLKNSKI 

LBICDNVTMYW\INPTL»\ ISGTFAKQLHHNFAF I ILVS FXQDFE 
EEGBDLHPPANEKKGIEQNEQWWPQVKVEKTRHARQASEEEIjP 
INDYTENGI EFDPMLDERGYCCI YCRRGNRYCRRVCE PLLG YYP 
YPYCYQGGR VTCRVI MPCNWWVARMLGR V 


6064 


913 


311 


Nl a PQSL.PRPTEHSPPYSLEKMTDL»VAVWDVAI>SDGVHKIEFEHG 
TTSGKR WYVDGKEE IRKEWMPKLVGKBTFYVGAAKTKATIN1D 
AI SGFAYEYTI^INGKSLKKYMEDRSKTTNTWVLHMDGENF^ 
LEK3AMDVWCNGKKLETAGEFVDDGTETHFS IGTH\ACYIKAV\ 
SSG\KRKEGIIHTLIVDNREIPEIAS 


6065 


1153 


641 


MS VR VARVAW VRGLGAS YRRGAS S FPVP P PGAQGVAELLRDATG 
AEEEAPWAATEPJWPGQCSVIJjFPGO^SQWGKGIIGIJ^NYPRVR 
BLYAAAPJIVIXSYCIJjELSI^GPQETLDRTVHCQPAIFVASLAAV 
EKLHHLQPSVIENCVAAAGFSVGEFAAIiVFAGAMEFAEG 


6066 


68 


3470 


VKENMPATRKPMRYGHTEGHTEVCFDDSGSFIVTCGSDGDVRIW 
BDLDDDDPKFINVGEKAYSCALKSGKLVTAVSNNTIQVHTFPEG 
VPDG I LTRFTTNANHVVFNGDGTKI AAGS S D \ FLVKI VDVMDSS 
QQKTFRGHDAPVI»SI*SFDPKDI FLxASASCDGSVRVWQISDQTCA 
IS WPLLQKCNDVTNAKS I CRLAWQPKSSKLLAI PVEKSVKLYRR 
ES MSHQFDLSDNFISQTLNIVTWS PCGQYLAAGS INGIiI I VWNV 
ETKJ5CME^VKHEKGYAICGLAWHPTCGRISYTDAEGNLGL.bENV 
CDPSG:<TSSSKVSSRVEKI)YNDLFTX5DDMSNAGDFLNDNAVEIP 
SFSKGI INDDEDDEDLMMASGRPRQRSKILEDDENSVD1SMI*KT 
GSSLLKEEEEDGQEGSIHNLPLVTSQRPFYDGPMPTPRQKPFQS 
GSTPLHDTHRFMVWNS IGI IRCYNDEQDNAI DVEFHDTS IHHAT 
HLSNTIjNYTIADIiSHEAILIACESTDELASKLHCIjHFSSWDSSK 
EWI IDLPQNEDIEAICLGQGWAAAATSAIXLL.RLFTIGGVQKEVF 
SXAGPWS^GHGEQLFIVYmGTGFTCDGCLGVQIil^LGKKKK 
QILHGDPLPLTPJCSYLAWIGFSAEGTPCYVDSEG IVRMLNRGLG 
NTWTPI CNTREHCKGKSDHYWWG 1 HENPQQLRCI PCKGSR FPP 
TLPRPAVAXLSFKLPYCQIATEKGQMEEQFWRSVI FHNHLDYLA 
KircYEYEESTKNQATKEQQELLMKMI»ALSCKLEREFRCV^ 
MTQNAVNLAI KYAS RSRKL ILAQKLS EXAVEKAAELTATQVEEE 
EEEEDER KICLNAG YSNTATEWSQPR FRNQ VEEDAEDSGEADD E E 
KPE IHKPGQNS FS KSTNSSD VS AKSGAVT FSSQGRVNP FKVSAS 
S KEP AMS^SARSTNILDNMGKSSKKSTALSRTTNNEKS P I IKP 
LIPKPKPKQASAASYFQKRNSQTNKTEEVKEENLKNVLSETPAI 
CPPCNTENQRPKTG FQMWLEENRSN I LS DNPDFSDEAD I IKEGM 
IRFRVLSTEERKVWANKAKGETASEGTEAKKRKRVVDESDETEN 
QEEKAKENIiNLSKKQKPLD FSTNQKLS AFAFKQE 


6067 


858 


321 


LPWQRI^VLLSRGKMAVTGWLESI^TAQKTAL^ 
PDGKEMAEE YDEKTS EI^LVRKWRVKSALGAMGQWQLEVGDPAPI* 
GAGWLGPELIKESNANPIFMRKDTKMSFQWRIRNI*PYPKDVYSV 
SVDQKERCI IVRTTNKKYYKKFS IPDLDRHQLPLDDALLSFA\T 
PTAP 


6068 


13 


1730 



GS KMADLANEEKPAIAP PVFVFQKDKGQ KS PAEQKNLS DSGEE P 
RGEAEAPHHGTGH PES AGEHALE P PAPAGAS ASTP PP PAPEAQL 
PPFPREIiAGRSAGGSSPEGGEDSDREDGNYCPPVKRERTSSLTQ 
FPPSQSEERSSGFRLKPPTLIHGQAPSAGLPSQKPKEQQRSVtR 
PAVLQAPQPKALSQTVPSSGTNGVS LPADCTGAVPAAS PDTAAW 
RS P S EAADE VCALEEKE PQKNESSNASE E EACEKKDPATQQAFV 
FGQNU^RVTOjINESVDEADMENAGHPSADTPTATl^FIiQYISS 
S IJBN3TNSADASSNKFVFGQNMS ERVLS P P KLNEVSSDANRENA 
AAESGSESSSQEATPEKESLAESAAAYTKATARKCLLEKVEVIT 
GEEAESNVIiQMQCKLFVroKTSQSWERGRGlJjRI^DH^STD^ 
TIX3SR LSDAGPRGSLR\ LI LNTKLWAQMQ I DKAS EK\S I R I TAM 
DNEDQGVKVFLISASSK3>TGQVYAALHHRIIAI>RSRVEQEQEAK 
M P APE PGAAP SNEEDDS DDDDVLAPSGATAAGAGDEGDGQTTGS 
T 


6069 


583 


27 


PTRPGQAGSSSAMAAQRLGKRVLS KLQSPSRARGPGGS PGGLQK 
RHARVTVKYDRRELQRRLDVEKW IDGRLEEL YRGMEADM PDE IN 
IDELLELESEEERSRKIQGLLKSCGKPVEDF I QELLAKLQGLHR 
QAPGLRQPS PS? \K^PSAPFO<5PGARTASPLTLIALFPGP PER 
RPAliCVLSCI 


6070 


478 


858 


IRVT^GBFLHYIFPI^FI^SPEW/RirrETHRGRHF\QVTLTAE 
TDCR YVSWRRKKLYKL FAQHRY I SRLFSVLIGSDIADKLYALND 
RVY IGKRYEYD I RLPNFYQMST PE I RRS PLTQHFQNSRR Y*J 


6071 


2 



1654 


HEARTKGNMALARP \ VRLFSLVTRLLLAPRRGLTVRS P DE PLP V 
VRI PVAiy3RQLEQRQSRRRNLPRPVI.VRPGPLLVSARRPEl.NQP 
ARLTLGRWERAPLASQGWKSRRARRDHFS I ERAQQEAPAVRKJbS 
S KG S FADLGAW K PRVLHALQE \ AAP EWQ \ PTTVQSSTI PSLLR 
GRHWCAAETGSGKTI^YIxLPLIXJRlxL^ 

VL VP S R3LAQQ VRAVAQ P LGRSLG LLVR DLF.GGHGMRR I RLQLS 
RQPSADVLVATPGALWKALKSRLI SIuEQLSFLVLDEADTLLDHS 
FLELVDYTLEKSHIAEGPADLEDPFNPKAQLVLVGATFPEGVGQ 
llnkvas pdavttitssklihcimphvkg^flri^gadkva^ 
IIJCHRDRAEJITGPSGTVLVFC^SSSTVNWLGYII^DHKI 
QGQMPALMRVGI FQSFQKSSRDILuirx ijIAoKWLjJL/o x\jVJSjj v vw 
YDFP PTLQDYIHRAGRVGRVGSEVPGTVISFVTHPWDVSLVQKI 
ELAARRRRSLPGLASSVKEPLPQAT 


6072 


1 


742 


KMERTEMMPTINSQLEFKSKPFPLVSSSRWLVKRGELTAYVEDT 
VLFSRRTSKQQVYFFLFNDVLI ITKKKSEESYNVNDYSLRDQLL 
VESCDNEELNS S PGKNSS TMIi YSRQS SASHLPTLTVLSNHANEK 
VEML LGAETQ S ERARW I TALGHS SGKPPADRTS LTQVE I VR S FT 
AKQPDELS LQVADWL I \ YQRVSDGWYEGER \LRDGERGW FPME 
CAKE I TCQAT IDKNVERMGRLLGLETNV 


6073 


620 


860 


PCRRGLARPLSRRPG/ S I LVHCAVGVSRSATLVLAYLMLYHHLT 
LVEA X KKVKDHRG 1 1 PNKGFLRQLLALDRRLRQGLEA 


6074 


168 


1110 


PGARCMATELQCPDSMPCHNQQVNSASTPSPEQLRPGDLILDHA 
GGNRASRAKVILLTGYAHS SLPAELDSGACGGSSLNSEGNSGSG 
DSSSYDAPAGNSFLEDCELSRQIGAQLKIX^MNDQIRELQTI IR 
DKTASRGDFMFSADRLIRLWEEGLNQL PYKECMVTTPTG iKiB 
GVKFEKGNCGVSIMRSGEAMEG^3IxRDCCRSIRIGKILIQSDEET 
QRAKVTYAKFPPDIYRRKVLLMYPILQTG\NTVIEAVKVL1EHG 
VQPSVI ILLSLFSTPHGAKS I IQEFPEITI LTTEVHPVAPTHFG 
QKYFGTD 


6075 


320 


1091 


" PPTCQPQEVEHH\YGYVPIIiGNKTLPSRCHQCVIVSSSSHLLGT 
KLGPE I ERAECT I RMNDAPTTG YS Al^GNKTTYR VVAH S SVFRV 
LRRPQEFVNRTPETVFIFWGPPSKMQKPQGSLVRVIQRAGLVFP 
NMEAYAVS PGRMRQFDDLFRGETGKDREKSHSWLSTGWFTMVIA 
VELCDHVHVYGMVPPKTYCSQRPRLQRMPYHYYEPKGPDECVTYI 
QNEHSRKGNHHRFI TEKRVFSS WAQLYG I TFSHPS WT 


6076 


1721 


107 


HPSPTEAPRVQHLTMDCTWRILFLVAAATGTHAQVQLVQSGAEV 
KKPGASVKVS CKVSG YTLTELS MHWVRQAPGKGLEWMGAFDPED 
GET X YAQivr QGRVTMTaPTib xxv 1 AiMfii.«*>^"w&u xxi v x xw-^xl/ 
HGD YAFDI WGQGTM VTVS SAPTKAPDVFP 1 1 S GCRHP KDNS P W 
lxACLITCYIxPTSV\TVTWYMGTQSQA\QRTFPEIQRRDSYYMTS 
SQI^TPLQQWRC^EYKCVVQHTASKSKKEIFRWPESPKAQASSV 
PTAQPQAEGSLAKATTAPATTRNTGRGGEEKKKEKEKEEQEERE 
TKTPECPSHaX2PLGVYLLTPAVQDLWLRDKATFTCFWGSDLKI) 
AHLTWEVAG KVPTGG VE EGLLERHSNGS Q S QHS RLTL PRS LWNA 
GTSVTCTIjNHPSI^PORLMALREPAAQAPVKLS lnllas sdppe 
A\ASWLLCEVSGFSPPNrLLMWLEDHGEVNTSGFAPARPL?KP\ 
RSTTFWA\WSV1^VPAPPSPQPATYTCWSHEDSRTLLNASRSL 

EVSYVTDHGPMK 


6077 


3687 


1266 



LLPDMNLQPI FWIGLISSVCCVFAQTDENRCLXANAKSCGECIQ 
AGPNCGWCTNSTFLQEGMPTSARCDDLEALKKKGCPPDDIENPR 

G S KD I KKNKNVTNRS KGTAEKLKPEDITQ I Q PQQLVLRLRSGEP 
QTFTLKFKRAEDYPIDLYYT^\DLSYSMKDDLENVKSLGTDI>IN 
E>n^ITSDFRIGFGSFVEKTVMPYISTTPAZLRNPCTSEQNCTS 
P FS YKNVl^ LTNK13EVFNELVGKQRI SGl^DS PEGGFDAIMQVA 
VCGSLI GWRNVTRLLVFS TDAGFHFAGlX5KIiGGI VIjPNDGQCHL 
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SEQ 

ID

NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








ENNMYTMSHYyDYPSIAHLVQKLSEttNIQTIFAVTEEFQPVYKE 
LKNLIP KSAVGTLSANSSNVIQLI IDAYNSLS SEV1 LENGKLSE 
GVTISYQS Y\ CKNGVNGTGENGRKCSNI SIGDEVQFEISITSNK 
CPKXDSDSFKIRPIjGFTEBVEVTLiQYICECECQSEGIPESPKCH 
EGNGTFECGACRCNEGR\rcRHCECSTDEVNSEDIGCFTARKENQ 
FQKSASNHGRVPSAGQCVCRKRDNTNE I YSGKFCECDNFNCDRS 
NGX»I CGGNGV CKCRVCECNPNYTG SACDCSLDTSTCEASNGQ I C 
NGRGICECGVCKCTTDPKFOGQTCEMC^TCZIjGVCAEHKECVQCRA 
FNKGEKKDTCTQECSYFNITKVESRDKIiPQPVQPDPVSHCKEKD 
VDDCWFYPTYSVNGI^EVMVHVVENPECPTGPDI IPIVAGWAG 
I VL I GLALLL I WKLLM I IHDRRE FAKFE KEKMNAKWDTGENP I Y 
KSAVTTWNPKYEGK 


6078 


1426 


180 


ETEDVMELLEEDLTCP I CCSLFDDPRVLPCSHNFCKKCLEGILE 
GSVRNSLWRP VPFKCPTC31KKTFS YWEL I PLQVNYSLKGI VEKY 
NKI KISPKMP VCKGH\ LGQPLNI F\CL\ TDMQLDL/CG IC\ATR 
GEHTKHVFCS IEDAYAQERDAFESLFQS FETWRRGDALSRLDTL 
ETSKRKS^LOjTKDSDKVTCEFFEKIjQHTLDQKK^rEILSDFETMK 
LAVMQAYD PE INKLNT I IjQEQRMAFN I ABAFKDVSE P I VPLQQM 
QEFREKIKVI KETPLP PS NLP AS PLMKN FDTSQWED I KLVDVDK 
LSLPQ0TGTFISKI PWSFYKLFLLI LLLGLVI VFGPTMFLEWSI* 
FDDIATWKGCI*SNFSSYI,TKTADFIEQSVFYWEQVTDGFFI FNE 
RFKOTTLVVLNNVAEFVCKYKLL 


6079 



1586 


141 


ATARD LG CARR I DRVVMESTPS RGLNRVHLQCRNIjQEFLGGIjSP 
GVLDRIi YGH P ATCLAVFRELPS LAKNWVMRMLFLEQPLPQAAVA 
LWVKKE FSKAQEE STGLLS GLR I WHTQLLPGGLQGLI LNP I FRQ 
NLRIAIiliGGGKAWSDDTSQICPDKHARDVPSLDKYAEERWEVVlj 
HFMVGS PSAAVSQDIiAQIiLSQAGLMKSTEPGE PPCITSAGFQFL 
LLDTPAQLW Y FMLQYXQTAQSRGMDLVE ILS FLFQLSFSTLGKD 
YSVEGMSDSLI*NFIjQHIjREFGI»VFQRKRKSRRYYPT/RALAiNI* 
SSGVSGAGGTVHQPGFIVXVETOYRIjYAYTESELQIALIALFSE 
MLYPFP \NMW\ARVTR\ESVQQArASGITAQQI IHFLRTRAHP 
VKLKQT? VLP PT I TDQ I RLWEIiERDRIiRFTEG VL YNQFLSQVDF 
ELL \ LAHAP KLG VLVFE /NTPAKRLMVVTPAGHSDVKR FWKRQK 
HSS 


6080 


1 


1199 


IETI DK VGEFAMAAQAAG VS RQRAATQ GLGSN QNALKYLGQDFK 
TLRQQCLDSGVLFKDPEFPACPSALGYKDLGPGSPQTQGIIWKR 
PTELCPSPQFIVGGATRTDICC^LGDCWLLAAIASLTLNEEIiL 
YRWPRDQDFQENYAGI FHFQPLCPPS ?\FWQYGEWVEWI DDR 
I^KNGQLLFLHSEC^NEFWSALLEKAYAKIjNGCYEAIiAGGSTV 
EGFEDFTGGISEFYDLKXPPANLYQI IRKALCAGSLLGCSIDVY 
SAAEAEAI TSQ KLVKSHAYS VTGVE EVNFQGHPEKLj IRLRNPWG 

eveksgawsddapewnhidprrxeeldkkvedgefwmslsdfvr 
qfsrle i cnls pdslsseevhkwnlvlfnghwtrgstaggcqny 
pgss 


6081 


3 


865 


EMLPLLLPLPLLWA/GALAQDARFRIiEMPESVTVQEGLCI FVHC 
SVFYLEYGWKDSTPAYGHWFREGVSVDQETPVATNNSTQKVQKE 
TQGRFHLXiGDPSRNWCSIiS I RDARRRDNG S YFFWARG RTKFS Y 
KYSPLS VYVTALTHRPDILI PE FLKSGHPSNLTCS VPWVCEQGT 
PPIFSWMSAAPTSLGPRTLHSSVLTI I PRPQDHGmLI CQVTFP 
GAGVTTERT IQLS VS WKSGTVEEVWLAVGVVAVKI LLLCLCL I 
I LSFHKKKAVRAVEVEENVYAVMG 


6082 


283 


1288 


EARSPGPTC^TAPGLAAPGIiAQPAALRLLLSRPPSAAPttJGDGD 
PESVGQPEEASPEEQPEEASAEEERPEDQQEEEAAAAA\Y\IiDE 
LPEPIJA/I^VIjAALPRHE\LVQACR\LVCLP^KELVDGAPLWL 
IJC<^EGLVPEGG^EEPJ3HWQQF^FI»SKRRRNLLRNPCGEEDL 
EGWCDVEHG^DGWRVEELPGDSGVEFTHDESVKKYFASSFEWCR 
KAQVIDI*QAEGYWEELLDTTQPAI VVKDWYSGRS DAGCLYETjTV 
KI^SEHEJTVLAEFS SGQVAVPQDSDGGGWME I SHTFTDYGPGVR 
FVRFEHGGQDSVYWKGWFGARVTNSSVVfVEP 


6083


1865


309 


KQWCAERRGLGMSLADELIADLEEAAEEEEGGSYGEEEEEPAl E



6084 



1865



309 



6085 



2 



1456 



6086 



2419 



1357



6087 



476 



1877 



6088 



1684 



689 






dvqeetqldlsgdsvktiaklwdskmfaeimmkieeyiskqaxa" 
sevmgpveaapey11vtvdannltveienelniihkfirdkyskr 
fpeles lvpnaldy i rtvkeix3ns udkcknnenlqqid'nqatlm 
vvsvtas ttqgqqiis eeelerx»eeacdma1»elnaskhri yeyve 
srmsfiapnlsi i igastaaki mgv7u5gltnlskmpacnxmi*lg 
aqrktlsgfsstsvlphtgyi yhsdivqslppi pppfsvap\d1> 
rrkaari»vaakctiiaarvds fhestegkvgyelkdb ierkfdkw 

QE P P P VKQ VKPLP AP LDGQR K KRGGRR Y PJCMKERLG LTE I R \ KQ 

ANRMSFGEIEEDAYQEDLGFSLGHLGKSGSGRVRQTQVNEATKA 
RI SKTLQRTLQKQS VVYGGKSTI RDRSSGTAS S VAFTPLQGIiE I 
VNPQAAEKKVAEANQKYFSSMAEFLKVKGEKSGLMST 



KQWCAERRGLGMSLADELIiADLEEAAEEEEGGSYGEEEEEPAIE 
DVQEETQl*DliSGDS VKTIAKliWDS KM FAE I HMKI EEYI S KQAKA 
SEVMGPVEAAPEYRVrvnANNLTVEIENELNI IHKFIRDKYSXR 
FP ELESLVPNALDY IRTVKELGNS LDKCKNNENLQQ I LTNATI M 
WSVTASTTQG(X}I^EEELERLEEACDMALELNASKHRIYEYVE 
SRMS FIAPNLS 1 1 IGASTAAKI MGVAGGI/TNLS KMPACNI ML.LG 
AQRKTLSGFSSTSVLPHTG YI YHSDIVQSLPP I PPPFSVAP\DL 
RRKAARLVAAKCTLAARVDSFHESTEGKVGYELKDEIERKFDKW 
QEPPPVKQVKPLPAPLDGQRiCKRGGRRYRKMKERLGI,TEIR\ KQ 
ANRWSFGEI EEDAYQEDIX3FSIX3H1X3KSGSGRVRQTQVNEATKA 
R I S KTLQRTIjQKQ; SWYGGKS T I RDRS SGT AS S VAFT P LQGLE I 
VNPQAAEKKVAEANQKYFS SMAEFliKVKGE KSGLMST 



SGPRS PQGNRAVG RI S LGGKRN PE VTLLPG VSS ER VRR WRRAR V 
GVARVKPGNPWKPSPATQVPR/VPAQVYIjPGRGPPLREGEELVM 
DEEAYVLYKRAQTGAPCLSFDIVRDHIjGDNRTELPLTI>YI»CAGT 
QAESAQSNRiMMLRMHNLHGTKPPPSEGSDEEEEEEDEEDEEER 
KPQLELAMVPHYGGINRVRVSWLGEEPVAGVWSEKGQVEVFAIjR 
RLLQWSEPQALAAFIiRDEQAQMKP I FSFAGHMGEGFAI*DWS PR 
VTGRIjLTGDCQKNIHLWTPTDGGSWHVDQRPPVGHTRSVEDLQW 
SPTENTVFASCSADASIR I WDIRAAPSKACMLTTATAHDGDVNV 
ISWSRREPFLLSGGDIX3AI»KIWDIiRQFKSGSPVATFKQHVAPVT 
S VEWH PQDSGVFAASGADHQ I TQWDLiG / 1 VERDP EAGDVEADPG 
LADLPC^LIjFVHQGETELKELHWHPQCPGLJjVSTALSGFTIFRT 
ISV 



GAATQHGGAMNIJjPCNPHGNGLLYAGFNQDHGCFACGMEKGFRV 
Y>TTDPIiKEKEKQEFIiEGGVGHVE^FRCNYIALVGGGKKFKYPP 
NKVMIWDDLKKKTVIEIEFSTEVKAVKIJ^\DKIVVV1^SMIKV 
FTFTHNP \HQLHVFE \TCYNPKGLCVLCPNSNNSLLAFPGTHTG 
HVQLVDLASTEKP PVDI PAHEGVLSC I ALNLQGTR IATASEKGT 
L I RI FDTSSGHLI QELRRGSQAANT YCI NFNQDASLI CVSSDHG 
TVHIFAAEDPKRNKQSSLASASFX.PKYFSSKWSFSKFQVPSGSP 
CICAFGTEPNAVIAI CADGSYYKFLFNPKGECIRDVYAQFLEMT 
DDKZj 



QNSQRTGLPITIFSRSFPLI/TGSDLCENMPCTCTWRNWRQWIRP 
LVAVT YLVS I WAVPLCVWELQKLEVG IHTKAWFIAG I FLLLTI 
PISLWVILQHI,VHYTQPEIiQKPIIRIl.WMVPIYSLpSWIAX»KYP 
GIAIYVDTCRECYEAYVIYNFMGFLTNYIjTNRYPNLVLILEAKD 
QQKHFPPLCCCPPWAMGEVI4,FRCKI^V1,QYTVVRPFTTIVALI 
CELLGIYDEGNFSFSNAWTYLVI INNMSQLFAMYCLiLLFYKVLK 
EELSPIQPVGKFLCVKLWFVSFTaQAWIALLVKVGVISEKHTW 
EWQTVEAVATGUQDFI ICIEMFI*AAIA\HHYTFS YKPYVQEAEE 
GSCFDSVLAMWDVSD IRDDISEQVRHVGRTVRGHPRKKLF PEDQ 
DQNEHTSLLSSSSQDAIS IASSMPPSPMGHYOGFGHTVTPQTTP 
TTAKISDEILSDTIGEKKEPSDKSVDS 



GASGLVRiliC^HRaiAPVAPKLVPPVRGVKKGFRAAFRFQKE 
LERQRUjRCP PPPVRRS E KPNWDYHAE IQAFGHRLQENFS LDL L 
KTAFVNS C Y I ICS EEAXRQQLG IEKEA VLLKLKS f IQELS EQGTS F 

SQTCLTQFLEDEYPDMPTEGI kni»vdfltgeevvchvarnlave 

QLTLSEEFPVPPAVIX^FFAVIGAIiLQSSGPERTALFIRDFLl 
TQMTGKELFEMWKI INPMGLLVEEIiKKRNVSAPESRLTRQSG\A








PTALPLY FVGIiYCDKKLIAEG PGET VLVAEEEAAR VALRKL YG F 
TENRRPWNYS KPKETLRAEKS ITAS 


6089 


3 



3054 


TRIjGIPGSTISSRPRIjCAIAAEGHFTjGHSV^GSRAGAUTGAPAW 
PSRRLRDLPAGGMWRLRRAAVACEVCQS LVKHSSGI KGSLPLQK 
LHLVSRS IYHSHHPTLKLQRPQLRTSFQQFSSLTNIjPLRKLKFS 
P I KYG YQ PRRNFW P ARIiATRLLKLR YL I IXSSAVGGG YTA XKT FD 
QWKDMIPDLSEYKWIVPDIVWEIDEYXDFEKIRKAIiPSSEDL,VK 
LAPDFDKIVESLSLIiKDFFTSGSPEETAFRATDRGSESDKHFRJC 
VS DKEKI DQU2EELLHTQLKYQR I LERLE KENKEJLRKLVIjQKDD 
KGIPFIESLRKSIilDMYSEVLDVliSDYnASYNTQDHLPRVVVVG 
DQSAGKTSVTiEMXAQARI FPRGSGEMMTRS PVKVTLSEGPHHVA 
LFKDSSRE FDLTKEEDLAWLRHE I ELRMRKNVKEGCTVS PET I S 
LNVKGPGLQRMVLVDIJJGVINTVTSGMAPDTKETIFS I SKAYMQ 
DPNAIILCIQIX3SVDAERSIVTDLVSO>!DPHGRRTIFVI>TKVDL 
AEKNVAS PSRIQQI I EGKLFPMKALG YFAWTGKGNSSES I EAI 
RE YEEEFFQNS K1aLKTSMIJ<AHQVTTRNLSIjAVSDCFWKMVRES 
VEQQADSFKATRFNLETEWKNNYPRLRELDRNELFEKAKNEILD 
EVISI^QVTPKHWEEIUXJSLWERVSTHVIENIYLPAAQTMNSG 
TFNTTVDIKLKQWTDKQLPNKAVEVAWETLQEEFSRFMTEPKGK 
EHDDI FDKLKEAVKEES I KRHKWND FAEDSLR VI QHNALEDRS I 
SDKO^WEDAAI YFMEEALQARLKDTF-NAIENMVGPD\WKKRVf LiYW 
KNRTQEQCVHNETKNELE KMLKCNEEHP AYLAS DE ITT VRKNLiE 
SRGVEVDPSLIKDTWHQVYRRHFIiKTAIiNHCNLCRRGFYYYQRH 
FVDSELECNDVVIiFWRIQRMLAITANTL>RG^LTNTEVRRIiEKNV 
KEVLEDFAEDGEKKI KLLTGKRVQLAEDLKKVREI QEKLDAFIE 
ALHQEK 


6090 


194 


1560 


PVFVPAPGAVLEQAS /ASPPLATQTWPLQHCKIPEJUPVQAS IL 
FELQLFFCQLI ALFVHYINI YKTVWWYPPSHPPSHTSIiNFKLID 
FNLLMVTTI VLGRRFIGS IVKEASQRGKVSLFRS I L LFLTR FTV 
LTATG W3 LCRS L I HLFRT YS FLNI^L/FPLLSVWDVHSVPAAEIiR 
P\RKTSLFNHMASMGPRBAVSGIiAKSRDYL»LTLR \RRGSSTQDS 
C^RTPCP/PIIACC^PSIiIRSEVEFLK>IDFlIWRMKEV3UVSSML 
SAYYVAFVPVWFVKNTHYYDKRWSCELFLLVSI STS VILMQHLL 
PASYODLLHKAAAHI^CWQKVDPALCSNVLQHPW-rEECMW 
LVKKSKNVYKAVGHYNVAIPSDVSHFRFHFFFSKPIJirLNILJuL. 
LEGAVI VYQLYSLMSSEKWHQTI SIAIrl LFSNYYAFFKLLRDRL 
VLGKAYS YSAS PQRDLDHRPS 


6091 


3279 


412 



SSRTREMEEKEILRRQIRLLQGLIDDYKTDHGNAPAPGTPAASG 
WQPPTYHSGRAFSARYPRPSRRGYSSHHGPSWRKKYSLVNRPPG 
PSDFPADHAVRPLHGARGGQPPVPQQHVLERQVQLSQGQNVVIK 
VKPPSKSGSASASGAQRGSIiEEFEDTPWSDQRPREGEGEPPRGQ 
LQPSRPTRARGTCSVEDPIiLVCQKEPGKPRMVKSVGSVGDSPRE 
PRRTVSESVIAVKASFPSSALPPRTGVALGRKIjGSHSVASCAPQ 
LIiGDRRVDAGHTDQPVPSGSVGG PARP AS G P RQAREAS LWTCR 
TNKFRKNNYKWVAAS S KS PRVARRALS PRVAAENVCKAS AGMAN 
KVEKPQLIADPEPKPRKPATSSKPG3APSKYKWKASSPSASSSS 
SFRWQSEAGSKDHASQIjSPVLSRSPSGDXRPAVGHSGLKPLSGE 
TPLSAYKVKSRTKI IRRRGSTSLPGDKKSGTS PAATAKSHLSLR 
RRQAI^GKSSPVLKKTPNKGLVQVTTHRLCRLPPSRAHI>PTKEA 
SSLHAVRTAPTSKVIiCTRYRIVKKTPASPLSAPPFPLSLPSWRA 
RRLSLSRS LVLNRLRPVASGGGKAQPGS PWWRS KGYRC IGGVI/Y 
KVSANKI»S KTSGQP S DAGS RPLLRTG PLD P AGS CSR S LASRA VQ 
RSLAI IRQARQRREKRKEYOIYYNRFGRCNRGERCPYIHDPEKV 
AVCTRFVRGTCKKTDG TCPFSHHVS KEKMPVCS YFTJCG I CSNSN 
CPYSHVYVSRKAEVCSDFUKGYCPIjGAKCKKKHTIjLCPDFARRG 
ACPRGAQCQIiLHRTQKRHSRRAATSPAPGPSDATARSRVSASHG 
PRXP5ASQRPTRQTPSSAALTAAAVAAPPHCPGGSASPSSS kas 
SSSSSSSSP PASIJ3HEAPSLQEAALAAACSNRLCKLPS FISLQS 
SPS PGAQPRVRAPRAPIiTKDSGXPLHI KPRL 


6092 


143 


3190 


AKAPPTGES S EP EAKVIiHTKRL YRAVVEAVHRI»DI»II^CNKTAYQ 
EVFXPEWISiRNKIiREIiCVKIWFZ^PVDYGRKAEELL








VlQI* I KTNKKH IHSRSTIJ2CAYRTHLVAG IG F YQHLLL Y IQSHY 
QLEMCCIDWTHVTDPLIGCKOTVSASGKEt^WAQMACHRCLVY 
LGDI^RYQNELAGVDTELIAERFYYQAIiSVAPQIGMPFNQLGTL 
AGS KYYNVEAMY CYLRC IQSEVS FEGAYGN3jKRI*YDKAAKMYHQ 
LKKCETRKI^PGKKRCKDIKRLLVNFMYLQSLLQPKSSSVDSEL 
TSLCQSVLEDFNLCLFYLPSSPNLSIaASEDEEEYESGYAFLPDI, 
LIFQMVlICI^CVHSXiERAGSKQYSAAIAFTIiAIiFSHLVNHVNI 
RIiQAELEEGENPVPAFQSDGTDEPESKEPVEKEEEPDPEPPPVT 
PQVGEGRKSRKFSRLSCIiRRRRHPPKVGDDSPIfSEGFESDSSHD 
SARASEGSDSGSDKSLEGGGTAFDAETDSEMNSQESRSDLEDME 
EEEGTRS PTLEPPRGRSEAPDSLNGPIiGPSEAS I ASNIjQAMSTQ 
MFQTKRCPRIAPTFSNIJlJ^PTTNPHTSASHRPCVNGDVDKPSE 
PASEEGSESEGSESSGRSCRNERS I QEKLQVLMAEGIjLPAVKVF 
LDWLRTNPDL.I IVCAQSSQSLWNRliSVLLNL.L.PAAGEL.QESGLA 
LCPEVQDIJJEGCELPDLPSSLLLPEDMALRNI.PPLRAAHRRFNF 

dtdrpllstleeswriccirsfghfiarlqgs ILQFNPEVGIF 

VS IAQS EQESLLQQAQAQFRMAQEEARRNRLrMRDMAQLRLQLEV 
SQLEGS LQQPKAQSAMSP YI»VPDTQAI*CHHLi P VI RQLATSGRF I 
VI I PRT V I DGIiDLLKKKHPGARDGI R YLEAE FKKGNR YIRCQKE 
VGKSFERHKLKRQDADAWTL YR I I*DS CKQLT \ LAQGAGEEDP SG 
MVT I ITGLPLDNPSLLSGPMQAALQAAAHASVDI KNVLDFYKQW 

KEIG 


6093 


76 


1002 



ATORRAMIALRVART/SRWGAL \RGAVWAPGTRPS KRRACWAL1* 
PPVPCCLGCliAERWRIiRPAAIjGIiRIiPGIGQRNHCSGAGKAAPR\ 
PAAGAGAAAEAPGGQWGPASTPSLYENPWTI PNMLSMTRIGLAP 
VXGYLI IEEDFNIALGVFALAGbTDIjLDGFIARNWANQRSALGS 
ALDPIADKILISII*YVSI,TY7^DI>IPVPLTYMI ISRDVMLIAAVF 
YVRYRTIjPTPRTLAKYFNPCYATARIiKPTFI s kvntavqlilva 
ASUVAP VFNYADS I YLQI I*WCFTAFTTAAS AYSY YHYGRKTVQV 
IKD 


6094 


23 


ldio 


WRLxMAP FNMRCKTCGE YI YKG KKFNARKETVQNEVYIiGLP I FR 
FYI KCTRCLAEITFKTDPENTDYTMEHGATRNFQAEKLbEEEEK 
RVQKEREDEEIiNNPMKVI*FJTRTKDSKI»EMEVLENLQEIiKDIiNQR 
QAHVDFEAMIiRQHRbSEBERRRQQQBEDEQETAAIiLEEARKRRL 
XiEDSDS EDEAAPS PLQPAIiRPNPTAIItDEAPKPKRKVEVWEQSV 
GSI*GSRPPIjSRIiWVKKAKADPDCSNGQPQA/ APHPRSPAEQEG 
GQP YTPDAWRVIiPEPTGCI PGQ 


6095 


1 


1599 


TRGRAAERS RGRGHG FIjGGGFA \ S WD Y F P S EDF Y RCG YCKNES 
GSRSNGMWAHSMTVQDYQDL IDRG WRRSG KYVYKP VMNQTCCPQ 

ytircrplqfqpskshkkvlkkmlkfiakgevpkgsceXdepmd 

STMDDAVAGDFAL INKLDIQCDIiKTLSDDI KESLESEGKNSKKE 
EPQEIiLQSQDFVGEKLGSGEPSHS








VKVHTVPKPGKGADLSKPPCRKAKEIRKERKRLKLMQQNPAGEL 
EGFQAQGHPPSLFPPKAKSNQPKSLEDLIFESLPENASHKLEVR 
VVRSSPPSSQFIC\TLLESYQVYKRYCWIHKNPPDTPTESQFTR 
FLCS S PLEAETP PNG PDCG YG S FHQQ YWZiDG KI I AVG VT D I LPN 
CVSSVYLYYDPDYSFLSLGVYSALREIAFTRQLiHEKTSQI>SYYY 
MGFYIHSCPKMKYKGQYRPSDLLCPETYVWPIEQCLPSLENSK 
YCRFNQDPEAVDEDRSTEPDRLQVFHKRA I M PYGWKKQQKDPS 
EEAAVLQYAS LVGQKCS ERMLLFRN 


6096 


2277 



575 


QRVRAALLSS AMEDSEALG FEHMGLDPRLLQAVTDLG W SR PTLI 
QEKAI PLALEGKDIJuARARTGSGKTAAYAIPMLQIiblJIRKATGP 
WEQAVRGLVLVPTKELARQAQSMI QQLATY CARDVRVANVS AA 
ED S VS QRAVLME KPD VWGTPSRI LSHLQQDS L.KLRDS L»E LLW 
EEADliliFSFGFEEELKSI^CHLPRIYG^FlJ^SATFTSIEDVQAIiKE 
LILHNPVTLKLQESQLPGPDQLQQFQVVCETEEDKFLLLYALLK 
LSLI RGKSLL FVNTLERS YRL»RIjFI*EQFS IPTCVLNGELPLRSR 
CHI I SQFNQGFYDCVIATDAEVLGAPVKGKRRGRGPKGDKASDP 
EAGVARGIDFHHVSAVIjNFDLPPTPEAYIHRAGRTARANNPGIV 
LTFVLPTEQFHI/GiaEELLSGENRGPIL^YQFRMEEIEGFRYR 
CRDAMRS VTKQA I REARLKE I KEEL LH S EKLXTYFEDNPR \ DLQ 
LLRHDLPI^PAVVKPHIX3HVPDYIjVPPAIA^ 
PLVGRPREQSPRTHCAASSTKERKSDPQPSPPEWGPLWS 


6097


1673 


192 


APGTMSGGKKKSS FQITSVTTDYEGPGS PGASDPPTPQPPTGPP 
PRLPNGEPSPDPGGKGTPRNGSPPPGAPSSRFRVVKLPHGLGEP 
YRRGRWTCVDVYERDLEPHSFGGLLEG I RGASGGAGGRS LDS RL 
ELAS LGLGAPT P PSGLS QG PTSWLRP P P TSPG PQARS FTGGLGQ
L WPS KAKAEKP PLS ASS PQQRPPEP ETGESAGTS RAATPLPS L 
RVEAJBAGGSGARTPPI^RRKAVDMRIiR14ELGAPEEMGQVPPLDS 
RPS S PALY FTHDAS LVHKS PD P FGAVAAQKFS LAH S M LA ISGHL 
DSDDDS GSGSLVG I DNKI EQAMDLVKSHLMFAVRE EVEVLKEQI 
RELAERNAAI£QENGLIiRAlA\SPEQLGSAGPPRGVPR\LGPPA 
PiraPFVLSLPSLTIVPLGLPGIASAAWPPLPMPALIVPVFPGVG 
VQALSNGP WSPGP LPHLL HPS LDGGGEG FRTGRQQGAP FGEET 
QPPPSLPGTPQQ 


6098 


168 


1074 


NYCI^HRSPLEKDSSPGSSSTSIJjIKKQRETSDTPIMRAIiKEIiD 
bisKJL r KNWGTQTEKEDTSNINPRQTETSVNASKoPEKCAQQ 
RLNSAS QRS SSLP P SNRKS STPTKRE I MLTP VTVA YS P KRS P KE 
I^PGFSHLI^KNESSPIRFDILI^DLDTVPVSTLQRTNPRKQL 
\ QFLPLDDS EEK\ T YSEKAT \DNI VNHS SCPBP VPNG VKKVSVR 
TAWEKNKS VS YEQCKPVSVTPQGNDFB YTAKIRTLAETERFF\D 
ELTKEKDQI EAALSRMPSPGGR ITLQTRLNQEAFGRS FGKD 


6099 


■kQO 




MVfT.DUO C TiT DirnOCDfteeCTQT T T WADUTCTVPD TMDTV T Wl T*» 

ft X LLuullv a rliCttvi/oo rbboo 1 o Job 1 KAUKc 1 oL/1 r iriKALiAbLD 

EGKIFKNWGTOTBKEDTSNINPRQTCTSVNASRSPEKCAQQRQK 
RLNSASQRSSSLPPSNRKSSTPTKREIMLTPVTVAYSPKRSPKE 
NLS PGFSHLLSKNESS PI RFDILLDDLDTVPVSTLQRTNPRKQL 
\QFLPLDDS EEK\TYS EKAT\DNI VNHS SCPEPVPNGVKKVSVR 
TAW2KNKSVSYEQCKPVS\m>QGNDFEYTAKIRTLAETERFP\D 
ELTKEKDQ I EAALSRMPS PGGR ITLQTRLNQEAFGRS FGKD 


6100 


2 


713 


FVEV3GYRSRADPEPRGRDTMTYAYLFKYI I IGDTGVGKSCLLL 
QFTDKRFQPVHDLTIGVEFGARMVNIDGKQIKLQIWDTAGQESF 
RSITRSYYRGAAGAX.LVYDITRRETFNHLTSWLEDARQHSSSNM 
VIML I GNKSDL E S RR D VKREEGEAFARE \ HGLI FMETS AICTACN 
VEEAFINTAKE I YRKIQQGLFDVHNEANG I KIG PQQS ISTSVGP 
SASQRNSRD IGSNSGCC 


6101 


1 



1399 


FRGRAWPLREVSHWLGCRRVCSWSAS WGRLPALSARL5 PLLAFR 
G^lVFPIiSCAVQQYAWGKMGSNSEVARLLASSDPLAQIAEDKPY 
AELWMGTHPRGDAKI LDNR ISQKTLS QWIAENQDS LGSKVKDTF








NGNLPFL FKVIjS VETP LS I QAH PN KELAEKLHLiQAPQH YP DANH 
KPEMAIALTPFQGLCGFT^PVEEIVTFLKKVPEFQFLIGDEAATH 
LK QTMS HDS QAVTiSSLQS CFS HLM KSBKKVVVEQIMLLVKR ISQ 
QAAAGNNMEDI FGEI*X*LQLHQQYPGDIGCFAI YFLNIiLTLKPGE 
AM FIiEANVP HAYIiKGDCVECMACS DNTVRAG LTP KF I D VPTL CE 
MLS YTPSS S KDRLFLP TRSQED P Yl>S I YDP PVPDFTIMKA\ EVP 
G\SVTEYKDLALDSAS ILIjMVQGTVXASTPTTQTPI PLQRGGVL 
F I GANESVS LKLTE PKDLL.I FRACCLL 


6102 


70 


2415 


QTPQATLAANGAEDSRGGEMLPAG2IGASPAAPCCSESGDERKN 
LEEKSDINVTVLIGSKQVSEGTDNGDLPSYVSAFIEKEVGNDL.K 
S LKKLDKL I EQRTVS KMQLEEQVLTI SSEI PKRI RSALKNAE ES 
KQFLNQFLEQETHIiFSAINSHLIiTAQP WMDDLGTMISQ I EE IER 
HLAYLKWISQ IEELSDNI QQYLMTNNVP EAAS TLVSMAELD I KL 

P PQSQTVG LS R PASAPE I YS Y1*ETL FCQLLKLQTS HELLTE P K\ 
HSQKNTLFLPPLI^S/WPIQVMI J T?IJ□KFJ^^YHFRGNRQT^^^ 
KPEWYLAQVIiMWIGNHTEFIjDEKIQPILDKVGSLVNARLE 
LMMLVXiEKIATDI PCLLYDDNI>FCHL,VDEVliIjFERELHSVHGYP 
GTFAS cmhi t^EETCFQRWLTVERKFALQKMDSMLSSEAAWVSQ 
YKDITDVDEMKVPDCAETFMTLLLVI TDRYKNLPTASRKLQFLE 
LQKDL VDDFR IRLTQ VMKEETRAS LG FR Y CAI LNA VNY I STVLA 
DWADNVFFLQLQQAALEVFAENNTLS KLQLGQ1ASMES SVFDDM 
INLLERLKHDMLTRQVDHVFREVKJ5AAKLYKKERWLSLPSQSEQ 
AVMSLSSSACPLIJLTLRDHIjIjQLEQQL»CFSLEKIFWQMLVEKLD 
VYIYQEI IIJVNHFNEGGAAQIjQ FDMTRNLFP LFS HYCKR PENYF 
KHIKEACIVLNLNVGSALTAGKDVLPVQLQGSFPAT 1 


6103 


207 


2523 


ESNSTMTTYIiEFIC^NEERDGVRFSWNVWPSSRLEATRMVVPVA 
ALFTPLKERPDLPPIQYEPVIjCSRTTCRAVXNPIjCQVDYRAKXiW 
ACNFCYQRNQFPPSYAGISELNQPAELIiPQFSSIEYWLRGPQM 
PIiI FL YVVDTCMEDEDLQ ALKES MQMS LSLL P PTAL VGL I TFGR 
MVQ VHELGCEG I S KS YVFRGTKDI>S AKQLQEMLGLSKVPVTQAT 

PLRSSGVALS I AVGLLECTFPNTGAR I MMFIGG PATQGPGMWG 
DELKTP I RS WHDI DKDNAKYVKKG TKH FEALANRAATTGHVID I 
YACLALDQTGLLEWKCC PNL. TGGYMVMGD S FNT S LF KQT FQRVFT 
KDMHGQFKMGFGGTIiEIKTPR\EIKISGAIGPCVSLNSKGPCVS 
ENE I GTGGT CO WKI CGLS PTTTLA I YFEWNOHNAP I POGG \ RG 
A\ I Q FVTQY \ QHSSGQRR I RVTT I ARN\ WADAQTQI QNI AAS FD 
QEAAA I LMARIiAI YRAETEEGPDVLR WLDRQi I RXiCQKFGEYHK 
DDPSSFRFSETFSLiYPQFMFHL»RRSS FLQVFNNSPDESS YYRHH 
FMRQDLTQSLIMIQPILYAYSFSGPPEPVLLDSSSILADRIIiLM 
DTFFQILI YHGETIAQWRKSGYQDMPE YENFRHLJLQAPVDDAQE 
I tiHSRFP M PRY I DTEHGG S QARFLLS KVN PS QTHNNMY AWGQ ES 
GAP I LTDDVSLQVFMDHLKKLAVSS AA 


6104 


124 


732 


kvseyi i lskdki lfhaiiamlvlvvs pwsaargvlrn ywerllr 
klpqsrpgfps ppwgpalavq\aqpclqsqqmi pvevkri /rsl 
ldsifwmaapknrrtievnrcrrrnpqklikvknnidvcpecgh 
lkqkhvi*caycye kvcketae i rrq xgkqeog pfkaptietwl 
ytgetpseqdqgkri ierdrkrpswftqn 


6105 


3 


989 


P LHGACTS LVLQRFCHRRP RP CAPARPEDMRR P AAVPLLIjLLCF 
GSQRAKAATACGR PRMLNR WVGGQDTQEG EWPWQVS IQRNGSHF \ 
CGGSL I AEQ WVLTAAHCFRNTS ETS LYQVLLGARQLVQPGPHAM 
YARVRQVESNPLYO/3TASSADVALVELEAPVPFTNYILPVCLPD 
PSVIFETGMNCWVTGWGSPSEEDLLPEPRILQKLAVPI IDT\ PR 
CNUiYSKOTEFGYQPKTIKNDMLCAGFEEGKKDACKGDSAGPLV 
CLVGQSWLQAGVIS WGEGCARQNRPGVYIRVTAHHNWIHRI I PK








LQVQPSEVGRPEVTPPGPGAP 


6106 


3 


1302 



GRPPTAPHTGRPPTANRGDPRI>DLKKGCARI^TSIESRGRPAAS 
AGLRRDRaUjRRWPLRRAPLARATRRRAGS PRRCAPRPRACPQG 
WSRARHC PGGLCLLIiIiLljCQFMEDRS AQAGN CWLRQAKNGRCQV 
LyKTELSKEECCSTGRI^TSVPTEEDVNDNTLFKWMI FNGGAPNC 
I PCKETCENVIX2GPGKKCRMNKKNKPRCV CAPDCSN I TWKGPVC 
GIJXJKTYRNECALLiKARCKEQPELEVQYCGRCKKTCRDVFCPGS 
STCV \ VDQTNNAYCVTCNR I CPE PAS S EQYLCGNDGVTY S \ SAC 
HLRKATCLLGRS IGLAYEGKCIKAKSCEDIQCTGGKKCLWDFKV 
GRGRCS LCDELCPDS KS DE PVCAS DNAT YAS ECAMKEAACSSGV 
LLEVKHSGSCNS I SEDTEEEEEDEDQDYSFPISSILEW 


6107


623 


168 


SRCSSPRPEPGRGRGK/LSPSEHRKWVEVFKACDEDHKGYLSRE 
DFKTAWMIiFG YKPSKI EVDS VMS S INPNTSG I LUEGFLNI VRK 
KKEAQRYRNEVRHI FTAFOTYYRGFLTIiEDFKKAFRQVAPKTiPE 
RTVLEVFREV\ DRDS\DGHVS F 


6108 


3 


1348 


GGSLRFSPPRVPS.CSRVFCPVPPGGCGLPSPMSASRPQSPTTPW
OjPRRYMKHKRDDGPEKQEDEAVDVTPVMTCVFVVMCCSMLVIJ-, 1 
YYFYDLLVYWIGIFCTASATGLYSCIiAPCVRRLP\SASAGESA
LLAPTI PNNSLPYFHKRPQARMT .ItTiAT.FCVAVSVVWGVFRNEDQ 
WAWVLQDALG IAFCL»YMI*KT I RLPTF KACTLLLLVLFL YDI FFV 
FI TP FLTKSGS S I MVEVATGPSD SATREKLPM VLKVPRLNS S PL> 
AL<2DRPFSLLGFGDII»VPGIiIjVAYC31^FDIQVQSSRVYFVACTI 
AYGVGLLVTFVAIJu^QRGQPALDYIjVPCTIjVTSCAVALWRREL 
GVFWTGSGFAKVLPPS PWAPAPADGPQPFKDSATP LSPQPPSEE 
PATS PWPAEQS PKSRTSEEMGAGAPMREPGSPAES EGRDQAQPS 
PVTQPGASA 


6109 


1 


1381 



CRS RAGAASGGAI LEGTKLRRQRVDTNKPLDPLVPSALRAAMLY 
LEDYLEMI EQLPMDLPJDRFTEMROEMDI>QVQNAMIXJI*EQRVS E FF 
MNAKKNKPEWREEQMASIKKDY YKALEDADEKVQLANQI YDLVD 
RHLRKLDQEIAKFKhJEDEADNAGITEIXiERRSLELDTPSQPVNN 
HHAHSHTPVEKPJCYNPTSHHTTTDHIPEKKFKSEALljSTLTSDA 
SKENTLGCRNNNS TASSNNAYltfVNS S QPLGS YN IGSLSSGTGAG 
GI \ TMAAA0 AVQATAQ MKEGRRTS S L KAS YEAFKNND FQLG KE F 
SMARETVGYSSSSAIiMTTLTQNASS SAABS RSGRKSKNNNKSSS 
QQSSSSSSSSSLSSGSSSSTWQEISQQTTWPESDSNSQVDWT 
YDPNEPRYCIC^QVSYGEIWGCDTQDCPIEWFHYGCVGLTEAPK 
G KWYCPQCT\ AAMKRRGSRHK 


6110 


77 


2464 


ACPSAATMSDQDHSMDEMTAWKIEKGVGGNNGGNGNGGGAFSQ 
ARSSSTGSSSSTGGGGQESQPS PLAIJ AATCSR I ES PNENSNNS 
QGPSQSGGTGELDLTATQLSQGANGWQ I ISSSSGATPTSKEQSG 
S S TNG S NG S ESS KNRTVS GGQ YWAAA PNLrQNQQ VLTG LPG VM P 
NIQYQVI PQFQTVDGQQLQFAATGAQ VQQDGSGQ I QI I PGANQQ 
I ITNRGSGGNI XAAMPNLLQQAVPLQGLANNVLSGQTQ YVTNVP 
VAIiNGNITLLPVNSVSAATLTPSSQAVTISSSGSQESGSQPVTS 
GTTISSASLVSSQASSSSFFTNANSYSTTTTTSNMG I MNFTTSG 
SSGTNSOGQTPQRVSGLCK3SDALNIQQNQTSGGSLQAGQQKEGE 
Q\NQQTQAAP KSI ^SR P QLVQGG\ QALQ \ AFQAAPLSGQTFTTQA 
ISQETLQNLQLQAVPNSGPI I IRTPTVGPNGQVSWQTLQLQNLQ 
VQNPQAQ T I TtiAPMQ GV S LGQTS S SNTTLTP I ASAAS I PAGTVT 
VNAAQLSSMPGIiQTINLSAIX^SGIQVHPIG^LPIAIANAPGDH 
GAQLGIiHGAGGDG IHDDTAGGEEGENS PDAQPQAGRRTR REACT 
CPYCKDS EGRGSGDPGKKKOH I CH I QGCG KVYGKTSHLRAHLRV7 
HTGERPFWCTWS YCGKRFTRS DELQRHKRTHTGEKKFACPECPK 
RFMRSDHLS KHI KTHQNKKGGPGVALS VGTLPLDSGAGS EGSGT 
ATPSALITTNWAMEAICPEGIAPJ^ANSGINVKEGGQFCSPINT 

SANGF


6111 


1637 


797 


RVDPRVRGAMAP WGKRLAGVRGVLLDI SGVLYDSGAGGG TAI AG
S VEAVARLKRSRLKVRFCTNESQKSRAELVGQLQRIjGFDI SEQE 
VTAPAPAACQIliKERGIiRPYIjLIHDGV\ASEFDQIDTS/STPNC 
VVIADAGESFSYQNMNNAFQVLMEI^KPVLISLGK^^ 
LMLDVGPYMKALEYACGIKAEVGGKPSPEFFKSALQAIGVEAHQ 
AVMIGDDrVGDVGGAQRCGMRALQWTGKFRPSDEHHPEVKADG 
YVDNLAEAVDLLLQRRDK 


6112 


77 


196 


MSSHKSFKSKRFIiAKKOKPNRPILOWIWLKTGNKIRHNWK


6113 


1779 


567 


WEG RS W AACGVXJL»QGAWG ERSG VRAS EAES PGKRADVS W WS RQL
ETMVDHIANTEINSQRIAAVESCFGASGQPIiALPGRVLLGEGVL 
TKECRKKAKPR I FFLFNDI LVYGS IVLNKRKYRSQHI I PLEEVT 
LELLP ETLQAKNR WM I KTAKKS FWSAASATERQEWISHIEECV 
RRQLRATGRPA\ STEHAAP W I PDKATD I CMRCTQTRFSALTRRH 
HCRKCRVWCAECSRQRFLLPRLSPKPVRVCSLCYRELAAQQRK 
EEAE E QGAGVPRAASHLAR P I CGR PVEMTMTPTRTRRAAGTATG 
PAAWSSTPRGWPGLPSTADPRPAEHLS PSQLHCPGPQEGSSRS C 
PGLRDPI PWKQVQRWGVALSGLPVPFCWTLCPYGFTAGKAFPFR 
KPQNTHRSW 


6114 


818 


246 


PTSRPRPSPGSPAMSWSACVSAAPSSSWPASSSWPCGPRRCCTR 
RRRCSPRCGLAAGSMCSCSPSWRCTPVPACWPSPPP\PAEQVQC 
GHLPPHADRRALRLP VAAP ARG PGPGHPAGPAG PRPARTP PAS P 
KGPGRPTVPAP P C PLLAATEPTPSR PHQRWTREDRMLGRG SQ VT 
GRPQWFLRGLVLFSL 


6115


324 


71 


DVCGRVCAHPHLYTHI HMHI CAHAC\I HTHAQLC/ 1 TASHALAH
SHLYTCMVMLTASHT PSHTHPHTAVHKEHRADVLRGTLTPLR 


6116 


595 


1430 


TGVMPPGRWHAA/ 1 SSSGPVFEGARA\LOTVKKBEEDESYTPVQ 
AARPQTLNRPGQELFRQLFRQLRYHES SGPLETLSRLRELCRWW 
IJU>DVLSKAQILEl*LVIiEQFLSILPGELRVWVQLHNPESGEE\L 
WPCWRS CRGTLMGHPGGTRALP \ EPRCALDGYRS \LRSAQI WS L» 
ASPLRSSSAXjGDHIiEPPYEIEARDFIiAGQSDTPAAQMPAIiFPRE 
GCPGDQVTPTRSLTAQLQETMTFKDVEVTFSQDEWGWLDSAQRN 
LYRDVMLENYRNMASIiGK 


6117 


1433 


222 


vgvps pappcswbvgpgggwtpg i lkegqggrrtpllllatrtr 
gllslfppaamhpaafplpvwaavlwgaaptrgliratsdhna 
smdfadi*pai*fgatls qegiiqgflveahpdnacs p i ap ? ppapv 

NGSVF I ALLiRRFDCNFDLKVLNAQKAG YGAAVVHNVNSNET iT«NM 
VWNSEBIQQQTWI PS VFIGERS S E YLRAL FVYEKGAR VLLVPDN 
TFPLGYYLI PFTGI VGLLVLAMGAVMIARCIQHRKRLQRNRLTK 
\EQLKQI \ PTHDYQKGDQYDVCAI CLDEYEDGDKLRVLPCAHAY 
HSRCVDPMLTQTRKTCPICKQPVHRGPGDEDQEEETQGQEEGDB 
GEPRDHPAS ERTPLLGSSPTLPTS FGS LAPAPLVFPG P S TDP P L 
SPPSSPVILV 


6118


X U *i *A 




KEKEKETQKEKIGEKGREEKVKRKEVEQKI KQEKQEKQERRKGK 
EKEEKRTKQGKETNKEKEQFKGQEEKGENKDSTLTRTPLEPLEK 
NKQILVIX5IJX5AGKTSVU1SIJ^NRVQHSVAPTQGFHAVCINTE 
DS OMEFLB 1GG S KPFRS YWEMYXSN / ADS LARS FSVGFKQDSQP 
ITWKAKKYLHQL I AANP VLP LVVFANKQDLEAA YH ITD IHEALA 
II 


6119 


1217 


462 


DPR FVTENTTKAPAQERTTQPRSS REGTLRSTME YLS ALN PSDL 
LRS VSN IS SEFGRR VWTSAP PPQRP FRVCDHKRTIRKGLTAATR 
QELIJVKALETLLLNGVLTLVLEEDGTAVDSBDFFQI^EDDTCLM 
VIjQSGQSWSPTP^GVLSYGLGRERPKHSKDIARFTFDVYKQNPR 
DLFGSLNVICATFYGXYSMSCT>FCX3L\GPKKVLI^IiI^ 
GLGHMLLG I SSTLRHAVEGAEQWQQKGRIiHS Y


6120 


785 


179 


LERAGGGGLSSRAIiVGSGACI*SLVARANG KGLPRGRKE FVEAVR 
VRYVAFRYRTPRAV(niRI>WSCRREVlMSGRGKQGGKVRAKAKSR 
SSRAGI^FPVGRVHRIiLRKGNYAERVGAGAPVYIiAAVLEYLTAE 
II^LAGNAARDNKKTRIIPRHI^IAIRNDEEI^KLI^KVTIAQG 
G \ VLPN I QAVLL P KKTESQ KDEGAND P 


6121 


1612 


107 


FVRAQARGSRQPVRRPI*LGAGSRLRCRS cgrmeplxvekfatan 
RGNGLRAVTPLRPGELLFRSDPLA YTVCKGSRGVV CDRCLLGKE 
KLMRCSQCRVAKYCSAKCQKKAWPDHKRECKCIiKS CKPRYPPDS 
VRLl^RWFKLMDGAPSESEKLYSFYDLESNINKLTEDKKEGLR 
QL, VMTFQHFMREE I QDASQLPPAFDL.FEAFAKVICNS FT I CNAE 
MQEVGVGLYPS ISLLNHSCDPNCS I VFNGPHLLLRAVRDIEVGE 
ELT I CYLDMliMTSEERRKQLRDQ YCFECD\ CFRCQTQD KDADML 
TGDEQVWKEVQESI*KKIEEIjKAHWKWEQVI*AMCQAI 3 SSNSERL 
PDINIYQLKVXjDCAMDACINI>GIjLEEAIvFYGTRTMEPYRIFFPG 

shpvrgvqvmkvgki^lhqgmfpqamk^riafdimrvthgreh 
sl i e dlj i*ll e / amrrqhqs i lrersqrb i rrvsllnallrsht 
lcfvscvnlsywkfcsvfv 


6122 


2 


2324 


RFRKMADGGAASQDESSAAAAAAADSRMNNPSETSKPSMESGIX3 

NTGTQTNGLD FQKQP VP vggai staqaqaflghlhqvqlagtsl 
QAAAQSLNVQSKSNEESGDSQQPS qpsqqps vqaai pqtqlmla 
GGQ I TGLTLTPAQQQI*LLQQAQAQAQIiI*AAAVQQHSASQQHSAA 
GATJSASAATPMTQIPLSQPIQIAQDI^I^QLCKKJNI'NLQQFV 
LVHPTTNIjQPA\QFIISQTPQGQQGLLQA\QNL1»TQLPRQSQAN 

llqsqpri\ti*tsqpatptctiaatpiqtlpqsqstpkridtps 
leep \sdlebi^qfaktfkqrriki^ft\c^daglamvklygnd 
fsptti frfeatnlsfknmcklkplijekwlndaenlssdsslss 
psalnspgiegi>srkrkiqitsiea\nirvai>eksflen\qkpts 
eeitmiadqi*nmekgvirvwfcnrrqkekrinppssgg\tsssp 
i kai fps ptslvattpslvtssaattltvspvlplt5aavtnls 
vtgtsdttsljn^atvistappassavtspslspspsasastsea 

SSASETSTTQTTSTPI>SSPLGTSQVMVTASGI.QTA/AQLIiPFKG 
AAQLPANAS LAAMAAAAGLN PSLMAPSQ FAAGGALLSLNPGTLS 
GALS P ALMSNS TLAT I QALASGG S LP ITS LDATGNLVFANAGGA 
PNI VTAPLFLNPQNLSLLTSNP VSLVSAAAASAGNSAPVAS LHA 
TSTSAESIQNSLFTVASASGAASTTTTASKAQ 


6123 


3 


2944 


HIjLHRWFGTDMQMINFTTGEFQLTEACPYLGTHSEESRFGIIjHL 
HLQPLEMKRVGWFTPADYGKVTSLI LIRNNLTVIDMIGVEGFG 
AR RTJ >KVGGRIjPGAGGSI»RFKVPESTLMDCRRQ1,KDSKQI LS I T 
KNFKVENI GPLP I T VS SLKINGYNCQG YGFEVLDCHQFS LD PNT
SRDIS IVFTPDFTSSWVIRDLSLVTAADLEFRFTLNVTIjPHHIiL 
PIjCADWPGPSWEESFWRLTVFFVSLSLLGVTLIAFQQAQYILM 

ef14ktrqrqnassssqqnngpi^visphsyksncknfldtygps 
dkgrgknclpvntpqsriqnaakrspatyghsqkkhkcsvyysk 
hktstaaasststtteekqts plgsslpaakedi ctdamrenwi 

S LRYASG INVNIXJKNLTLPKNLIaNKEENTLKNT IVFSNPSSECS 
MKEGIQTCMFPKETD I KTSENTAE FKERELC PLKTS KKXiPENHL 
PRNSPQYHQPDLPEI SRXNNGNNQQVPVKNEVDHCENLKKVDTK 
PSS EKKIHKTSREDMFSEKQDI PFVEQEDPYRKKKLOEKREGNL 
QMLIWSKSRTC^UCNKKRGVAPVSRPPEQSDLKLVCSDFERSEIjS 
SDINVRSWCIQESTREVCKADAEIASSLPAAQREAEGYYQKPEK 
KCVDKFCSDSSSDCGSSSGSVRASRGSWGSWSSTSSSDGDKKPM 
VDAQHFLPAGDS VSQNDFPSEAPISLNLSHN I CNPMTGNSLPQY 
AEPSCPSLPAGPTGVEEDKGLYSPGDLWPTP PVCVTS SLNCTLE 
NGVPCVI QES AP VKNS FI DWSATCEGQFS S AYCPLELND YNAFP 
EENMNYANGFPCPADVQTDFIDHNSQSTWNTP P\NMPAS \ WGNA 
QFPSS SRPYLKSTPKACLPMSGLFGPI \ WAP \QSDVYENCCPIN 
SEQ ID NO: 

Predicted beginning nucleotide location corresponding 
to first amino acid residue of amino acid sequence 

Predicted end nucleotide location corresponding 
to first amino acid residue of amino acid sequence 

Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 
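The legend above can be applied mechanically when reading an entry in the table. The following is a minimal sketch, not part of the original disclosure: the function name `summarize_segment` and the returned dictionary layout are hypothetical, and treating lower-case letters as their upper-case codes is an assumption made only to tolerate scanning damage. It separates residue codes from the three marker symbols defined in the legend and sets aside any character the legend does not define.

```python
# One-letter residue codes exactly as listed in the table legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Non-residue symbols defined in the same legend.
MARKERS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(segment):
    """Split one table segment into residues, legend markers, and leftovers.

    Whitespace is ignored. Lower-case letters are upper-cased (an assumption,
    not something the legend states). Anything else is reported unchanged in
    'unrecognized' rather than guessed at.
    """
    residues = []
    marker_counts = {name: 0 for name in MARKERS.values()}
    unrecognized = []
    for ch in segment:
        if ch.isspace():
            continue
        if ch in MARKERS:
            marker_counts[MARKERS[ch]] += 1
        elif ch.upper() in LEGEND:
            residues.append(ch.upper())
        else:
            unrecognized.append(ch)
    return {
        "residues": "".join(residues),
        "markers": marker_counts,
        "unrecognized": unrecognized,
    }


if __name__ == "__main__":
    # A segment line as it appears in the table; a raw string keeps backslashes literal.
    example = r"PTTEHSD / THMENQA\WCKEYYPGF \NPFRAYMNLDI WTTT\A"
    summary = summarize_segment(example)
    print(len(summary["residues"]), summary["markers"], summary["unrecognized"])
```

Characters collected under `unrecognized` are left for manual review instead of being mapped to a residue, since the legend gives no basis for interpreting them.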








PTTEHSD / THMENQA\WCKEYYPGF \NPFRAYMNLDI WTTT\A 
NRNANFPL»SRDSSYCGNV 


6124 


1573 


236 


SDEAIiRLAGERGMGRVQLFE I S LSHGR WYSPGE PLAGTVRVRJj 
GAPLPFRAIRWCIGSCGVSNKANDTAt^VVEEGYFNSSL»SIADK 
GS LPAGEHS FP FQ FKLPATAPTS FEGPFGKI VHQVRAAIKTPRF 
S KDHKCS I* VFY 1 I*S PLNLNS I PDIEQ PNVASATKKFS YKLVKTG 
SWLTAS TDLRGYVVGQALQLHADVENQSGKDTS P WASIJjQKV 
S YKAKRW I HDVRT IAEVEGAGVKAWR RAQWHEQII#VPAL»PQSAI* 
PGCSLIHIDYYLQVSLKAPEATVTLPVFIGNIAV/NPCPSEPPA 
RPGAAS WG PTPGG \ PS AP PQEEAEAKAAAGGPHFLD P VFTjS TKS 
HSQRQPLLATLSSVPGAPEPCPQDGSPASHPIiHPPLCISTGATV 
PYFAEGSGGPVPTTSTLILPPEYSSWGYPYEAPPSYEQSCGGVE 
PSLTPES 


6125 


1 


904 


KTCPKLTCAFTVSVPDSCCRVCRGDGELSWEHSDGDIFRQPANR 
EARHSYHRSHYDPPPSRQAGGLSRFPGARSHRGALMDSQQASGT 
IVQIVXimKHKIJGQVCVSNGKTYSHGESWHPNIJIAFGIVECVLC 
TCNVTKQECKKIHCPNRYPCKYPQKIDGKCCKVCPG/KKAKEEL 
PGQSFDNKGYFCXSEETMPVYESVFMEDGETTRKIALETERPPQV 
EVHVWTIRKGILQHPHIEKISKRMFEEJOPHFKIjVTRTTLSQWKI 
FTEGEAQISQMCSSRVCRTELEDLVKVLYLERSEKGHC 


6126 


1224 


389 


rllseapcprsrrrfqmnpewgoafvhvavagglcavavftgif 

DS VS VQVG YEHYAEAP VAGIiPAFliAM P FNS IjVNMAYTUjGIjS WI» 
HRGGAmijGPRYI.KDVFAAMALLYGPVQWIJiIiWTQWRRAAVIJ)Q 
WI*TLP I FAWPVAWCLYLDRGWRP \ WLFIjSLECVSIAS YGIALIoH 
PQG FETVALG AHWP AVGQALRT \ HRHYG / S ATPSATYLALGVLS 

CLGFVVLKiCl^HQLiARVnUjFQCLTGHFWSKVCDVI^raFAPIiFIj 
THFNTHPRFHPSGGKTR 


6127 


1335 


463 


VLPRRCLVFVVNTMDSSREPTIiGRLDAAGFWQVWQRFDADEKGY 
I EE KE LDAF FLHMLMKLGTDDTVM KANLH K V KQQ FMTTQDAS KD 
GRIRMKEIAGMFLSEDENFLLLFRRENPLDSSVEFMQ1WRKYDA 
DS SG F IS AAKIj RNFLRDLFLHHKKAJ SEAKLEE YTGTMMK I FDR 
^^u^GRI^DIJ^^)IlARIIALQENFIiLQFKMDACSTEKRKGDFEKIFA 
YYDVS KTGALEGP \ EVDGF VKDMMELVQPS I SGVDLDKFRE I LL 
RHCDVNKDGKI QKSEIALCIX3LKINP 


6128 


2511 


843 


TCRMSRRQLERWVWSSQQVQARGRNVRAPRLGKIAMGIjEMSSKD 

spgsldgrawedaqkpqsawcggrktrvyatssrrappsegtrr 
ggaarpektaeegppaapgslrhsgplgphacptalpepqvtsa 
mssqwgieply i kaepaspdspkgsseteteppvalapg\pap 
trc^pghkeeedgegagpgeqgggklvlsslpkriiclvcgdvas 
g yhygvas ceacka ffkrt i qgs ie ys cpasne ce i tkrrrkac 
qacrftkclrvgmlkegvrldrvrggrqkykrrpevd plipfpg p 
fpagpiavaggprktaapvnalvshdiivvepekliyampdpagpd 
ghlpavatxcdifdrei wti swaks i pgfsslslsdqmsvlqs 
vwmevlvlgvaqrs iitlqdeiiafaeylvl»deegarpaglgelg\ 

AALIiQIiVRRDQALRLEREEY VLLKAIAIANSDSVHI EDEPRLWS 
S CEKLLHEALLE YEAGRAGPGGGAERRRAGRIXLTLPLLRQTAG 
KVIiAHFYGVKIjEGKVPMHKLFLEMLEAMMD 


6129 


1764 


771 


ARFARSAHEGKMPKKKTGARKKAENRREREKQIiRASRSTIDIiAK 
HPCNASMECDKCQRRQKNRAFCYFCNSVQKLP I CAQCGKTKCMM 
KSSDCVIKHAGVYSTGLAMVGAI CDFCEAWVCHGRKCLSTHACA 
CPLTDAEC\VECERGVWDHGGRIFSCSFCHNFLCEDDQFEHQAS 
CQ VLEAETFKCVS CNRIGQHS CLRCKACFCDDHTRS KVFKQEKG 
KQPPCPKCGHETQETKDLSMSTRSLKFGRQTGGEEGDGASGYDA 
YWKNLSSDKYGDTSYHDEEEDBYEAEDDEEEEDEGRKDSDTESS 
DLFTNUSTLGRTYASGYAHYEEQEN 


6130 


3 


577 


GRGGTMRE YKWVLGSG \GVGKS ALTVXQFVTCTFIE KYDPTI E 
DF YRKE I EV \DSS PS VAG I S WTQQGTEQ F VASMRDli Y I KKGQGC 
I LVYSLVNQQS FQ\DI KPMRDQ I IRVKVS EKVPVT \LVGN\S VD 
LESEREVSSSEGRALAEEWGCPFMETSAKSKTMVDELFAEIVRQ 
MNYAAQPDKDDPCCSACNIQ 


6131 


3 


1811 


SS PREKTS DSSKRPSRHG FLFLRL.VGLS PFS YLCVP PSRPVPGS 
PRSLSAMRLLPLAPGRLRRGS PRHLPSCS PAliLLLVLGGCLGVF 
GVAAGTRRPNVVLIJjTDI)QDEVIiGGMTPI>KKTKAIjIGEMGMTFS 
SAYVPSALCCPSRASIIjTGKYPHNHHVVNNTLEGNCSSKSWQKI 
QE PNTFPAI LRSMCGYOTFF\AGKYLNEYGAPDAGGLEHVPLGVI 
SYWYALEKNSKYYl^TLSIKGKARKHGENYSVDYLTDVIANVSIi 
D FIiD YKS NFE PF FMMTATP \APHS PWT AAPQYQKAFQNVFAPRN 
KNFNIHGTNKHWLIRQAKTPMTNSSIQFLDNAFRKRWQTLLSVD 
DLVEKLVKRLEFTGELNNTYI FYTSDNGYHTGQFSIjP I DKRQLY 
EFDI KVPLLVRGPG I KPNQTS KMLVANIDLGPTIIiDI AGYDLNK 
TQMDGMSLL P I LRGASNI.TWRS DVLVE YQGEGRNVTDPTCPSI>S 
PGVSQCFPDCVCFJDAYNNTYACVRTMSALWNI»Q YCE FDDQEVFV 
EVYl^TADPIXJITN'IAKTIDPELliGKMNYPXMriliQSCSGPTCRT 
PGVFDPGYRFDPRLMFSNRGSVRTRRFSKHLL 


6132 


96 


1241 


AAGIiLPPGLVPEDPRRTRNIiLPFGIQGPPFAIjSRPLFSCVESGW 
AWFJUVIEPEFLYDt^LPKGVEPPAEEELSKGGKKKYBPPTSRKD 
PKFEELQKPA\ VLMEW INATLLPEHI WRSLEEDMFDGLI LHHL 
FQRI*AAI*KJ^EAEDIAI.TATSQKHK1.TWLEAVNRS\CSWRSGRP 
SGA/WES I FIJKDDLSTI^LVALAKRFQPDLSLPTNVQVE VITI 
ESTKSGLKSEKI*VEQLTEYSTDKDEPPKDVFDEbFKriAPEKVNA 
VKEAI VNFVNQKLDRLGLSVQNLDTQFADGVI LLLLIGQLEGFF 
LHIiKEFYl*TPNSPAEMLHNVTLALELL/ IGRGPAQLPC/LALK/ 
TT VNKDAKS TLRVL YGL FCKHTQKAHRDRTPHGAPN 


6133 


2 


4256 


FVHGSMADTDLFMECEEEEIiEPWQKISDVIEDSWEDYNSVDKT 
TTVS VS QQ PVSAPVPIAAHAS VAGHLSTSTTVS S S GAONSDSTK 
KTLVTLIANNNAGNPIiVQQGGQPLILTQNPAPGIiGTMVTQPVLR 
P VQVMQNANHVTSSPVASQ PI FITTQGFPVRNVRPVQNAMNQVG 
I VLNVQQGQTVR P I TL VP A PG TQ F VK PT VGVPQ VF S Q MT PVR PG 
STMPVRPTTNTFl'TVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 
TATQPTS LGQLAVQS PGQSNQTTNPKLAPS FP S P PAVS IAS FVT 
VKRPGVTGENSNEVAKLVNTLNTI PSLGQSPGP WVSNNSSAH\ 
GSQRTSGPESSMKVTSS I PVFDLQDGGRKICPRCNAQFRVTEAI* 
RGHMCYCCPEMVEYQKKGKSLDSEPSVPSAAKPPS PEKTAPVAS 
/THPSSTP IPA1.S PP Y/TKVPE PNENVGDAVQTKI.IMI>VDDFYY 
GRDGGKVAQLTNF P KVATS FRCPHCTKRiKNN IRFMNHMKHHVE 
IiDQQNGEVDGHTICQHCYRQFSTPFQLQCHLENVHS PYESTTKC 
KICEWAFESEPLFMHMKDTHKPGEMPYVCQVCQYRSSLYSBVD 
VHFRMIHEDTRHLLCPYO,KVFraGNAFQQKYMRHQKR\NVra\ 
COTOTVQFLFAKDKIEHKLQHHKTFRECPKQLEGLKPGTKVTIRA 
SRGQPRTVPVSSrnJTPPSJU^QEAAPLTSSMDPLPVFLYPPVQRS 
I QKRAVRXMS VMGRQTCLECS FE I PDFPNHFPTYVHCSIjCR YST 
CCSRAYANHMINNHVPRKSPKYLAIiFKNSVSG I KLACTSCTFVT 
S VGDAMAKHLVFN PS HRS S S I LPRGLTWI AHSRHGQTRDRVHDR 
NVKNMYPPPSFPTNKAATVKSAGATPAEPEELLTP3LAPALPSPA 
S TATPPPTPTHPQAI*AIj PPLATEGAECLNVDDQDEGS PVTQEPE 
LASGGGGSGGVGKKEQI^VKKI^VVLFALCOn^QAAEHFRNPQ 
RRIRRWLRRFQASQGEmjEGKYLSFEAEEKliAEWVL,TQREQQLP 
VNEETLFQKATKIGRSIiEGGFKISYEWAVRFMIiRHHLTPHARRA 
VAHTLPKDVAENAGIiFIDFVQRQIHNQDLPLSMIVAIDEISLFL 
DTEVX.SSDDRKENALQTVGTGE PWCDWLAI 1ADGTVL PTLVF Y 
RGQMDQPANMPDSILLEAKESGYSDDEIMELWSTRVWQKHTACQ 
RSKG^VMDCHRTHI^EEVIAKLSASSTLPAVVPAGCSSKIQPL 
DVC I KRTVKNFIjHKKWKEQAREMAOTACDSDVIiIjQIjVLVWLGE V 
LGVI GDCPELVQRS F1*VASVI»PGPDGNINSPTRNADMQEELIAS 
IiEEQLKLSGEHSESSTPRPRSSPEETIEPBSLHQLFEGESETES 
FYGFEEADLDLMEI 


6134 


2 


425S 


FVHGSMADTDLFKECEEEELEPWQKISDV1EDSVVEDYNSVDKT 
TTVS VSQQPVS APVP lAArfASVAGHLSTSTTVS SSGAQNSDSTK 
KTLVTLI ANNNAGN P LVQQGGQ PL I LTQN P APGLGTMVTQ P VLR 
PVQVMQNANHVTSS PVAS QPI F ITTQG FPVRNVRPVQNAMNQVG 
IVLNVQQGQTVRP I TLVPAPGTOFVKPTVGVPQVFSQMTPVRPG 
STMPVRPTTNTFTTVI PATLTIRSTVPQSQSQQTKSTPSTSTTP 
TATQPTSIX^IiAVQSPGQSNQTTNPKLAPSFPSPPAVSIASPVT 
VKRPGVTGENSNEVAKLVNTLNTI PSLGQSPGPVWSNNSSAH\ 
GSQRTSGPESSMKVTSS I PVFDLQDGGRKI CPRCNAQFRVTEAL 
RGHMC YCCPEMVE YQKKGKS LDSE PS VPSAAKP PS PEKTAP VAS 
/THPS STPI PALS PPY/TKVPEPNENVGDAVQTKLIMLVDDFYY 
GRDGGKVAQLTNFPKVATSFRCPHOTKRIiKNNIRFMNHMIGlHVE 
I^MNG5VIX3HTICQHCYRQFSTPFQLQCHLENVHSPYESTTKC 
K1CEWAFESEPLFLQHMKDTHKPGEMPYVCQVCQYRSSLYSEVD 
\^FRMIHEDTRHLLCPYCLKVFKNGNAFQQHYMRHQKR\NVYH\ 
CNKCRVQFIiFAKDKIEHKLQHHKTFRKPKQLEGIjKPGTKVTIRA 
SRGQPRTVPVSSNDTPPSAliQEAAPLTSSMDPLPVFLYPPVQRS 
IQKRAVRKMSVMGRQTCLECS FEIPDFPNHFPTYVHCSLCRYST 
CCS RAYANHM XNNHVPRKS PKYLALFKNSVSGI KLACTSCTFVT 
SVGDAMAKHLVFNPSHRSSSILPRGLTWIAHSRHGQTRDRVRDR 
NVKNMYP PPS FPTNKAATVKSAGATPAE PEELLTPLAPALPS PA 
STATPPPTPTHPQALAIiPPIiATEGAECLNVDDQDEGSPVTQEPE 
IiASGGGGSGGVGKKEQLSVKKI^VVLFALCCNTEQAAEHFRNPQ 
RRIPJlWUUiFQASG/3Em£GKY15FEAEEKLAEWVLTQREQQIiP 
VNEETLFQKATK I GRSLEGG FKI S YEWAVRFMLRHHLTPHARRA 
VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSM IVAIDEI SLFL 
DTEVLS SDDRKENAIjQTVGTGEPWODVVLAI LAIXJTV^ 
RGQMDQPANMPDS ILLEAKESGYSDDE IMELWSTRVWQKHTACQ 
RSKGT^VMDCHRTHIjSEEVIAMLSASSTLPAVVPAGCSSKIQPIj 
DVCIKRTVKNFI^KKWKEQAREMADTACDSDVI^ 
LGVIGDCPELVQRSFLVASVLPGPDGNINS PTRNADMQEELI AS 
LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES 
FYGFEEADLDLMEI 


6135 


2 


4256 


FVHGSMADTDLFMECEEEELE PWQKISDVI EDSWEDYNSVDKT 
TTVS VSQQPVSAPVP I AAHAS VAGHLSTSTTVS S SGAQNSDSTK 
KTLVTLIANNNAGNPLVQQGGQPLI LTQNPAPGLGTMVTQPVLR 
PVQVMQNANHVTSSPVASQPI FI TTQGFPVRNVRP VQNAMNQVG 
I VLNVQGGQTVRP I TLVPAPGTQFVKPTVGVPQVFSQMTPVRPG 
STWPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP 
TATQPTSLGQLAVQS PGQSNQTTNPKLAPS FPS PPAVS IASPVT 
VKRPGVTGENSNEVAKLVNTLin , IPSLGQSPGPVVVSNNSSAH\ 
GS QRTSG PES S MKVTS S I PVFDLQDGGRKI CP RCNAQFRVTEAL 
RGHMCYCCPEMVEYQKKGKSLDSEPSVPSAAKP PS PEKTAP VAS 
/THPSSTP I PALS PP Y/TKVPEPNENVGDAVQTKLIMLVDDFYY 
GRDGGKVAQLTNFPKVATSFRCPHCTKRLKNNIRFMNHMKHHVE 
LDO^NGEVDGHTICQHC^QFSTPFQUSCHLENVHSPYESTTKC 
KICEWAFESEPLFLQHMKDTHKPGEMPYVCQVCQYRSSLYSBVD 
VHFRMIHEDTRHIJiCPYCLKVFKNGNAFQQHYMRHQKRNNVYHX 
CNKC^VQFLFAKDKIEHKIiQHHKTFRKPKQIiEGLKPGTKVTIRA 
SRGQPRTVpVS SNDTPPS ALQEAAPLTSSMDPLFVFLYPP VQRS 
IQKRAVRKMSVMGRQTCLECS FEI PDFPNHFPTYVHCSLCRYST 
CCSRAYANHMINNHVPRKSPKYLALFKNSVSG I KLACTSCTFVT 
SVGDAMAKHLVFNPSHRSSS ILPRGLTWIAKSRHGQTRDRVHDR 
NVKNMYPP PSFPTNKAATVKSAGATPAEPEELLTPLAPALPS PA 
STATP PPT PTHPQALA1J PIATEGAECLNVDDQDEGS PVTQE PE 
LASGGGGS GG VG KKEQLS V KKLRWL F ALCCNTEQ AAEH FRN PQ 
RRIRRWLRRFQASQGENLEGKYLS FBAEEKLAEWVLTQREQQLP 
VNEETLFQKATKIGRSLEGGFKIS YEWAVRFMLRHHLTPHARRA 
VAHTLPKD VAENAGLF I DFVQRQ I HNQDLPLSMIVAI DEISLFL 
DTEVLSSDDRKENALQTVGTGE PWCDWIAI LADGTVLPTLVFY 
RGQMDQPANMPDS ILLEAKESGYSDDEIMELWSTRVWQKHTACQ 
RSKGMLVMDCHRTHI*SEEVLAM1*SASSTLPAVVPAGCSSKIQPL 
DVCI KRT VKNFIiHKKWKEQAREMADTACDSDVLLQIiVLVWLGEV 
LGVI GDC P ELVQRS FLVAS VLPG PDGNINS PTRNADMQEE L IAS 
LEEQLKLSGEHSESSTPRPRSS PEETIE PESLHQLFEGES ETES 
FYGFEEADLDLME I 


6136 


1704 


539 


FG VRMALEGMS KRKRKRS VQEG ENPDDGVRGS P PED YRLGQVAS 
SLFRGEHHSRGGTCRI»ASLFSSI>EPQIQPVYVPVPK\ESAIASA 
DLEEEIHQKQGQKRKNSQPGVKVADRKIIiDDTEDTVVSQRXKlQ 
INQEEERLKNERTVFVGNLPVTCN KKKLKSFFXEYGQIESVRFR 
SIiIPAEGTLSKXLAAIKRKIHPDQKNINAYVVFKEESAATQALK 
RNGAQIADGFRIRVDIASETSSRDKRSVFVGNLPYKVEESAIEK 
HFLDCGSIMAVRIVRDKMTGIGKGFGYVLFBNTDSVHLAIjKIjNN 
SEL^RKLRVMRSVNKEKFKC^NSNPRIJaJVSKPKQGLNFTS KT 
AEGHPKSLFIGEKAVIiLKTKKKGQKKSGRPKKQRKQK 


6137 


141 


2656 


ralrkrrcgpgrrgaix5sgpgpqrrpgrvpeerjpapprerkhpg 
mwnmli vamcia\llglpgkaqelqghvs\ i ilageqlgdlakk 
ylwqg \lfqlyldeagrghs fs fhgaaltapkqgqelmakax.es 
lscpkdmapshcaehiffiq flqls q yrqlktaedyqalnkd 1 eaq 
LQHAGLREAGG 1 FYFSVP pfayedi arninsscrpgpgawlrw 
LEKP FGHDHFS AQQLATELGTFFQEEEMYRVDHYLG KQAVAQ I L 
PFRDQNRKALDGLWNRHHVERVEI 1 MKETVDAEGRTSFYEEYGV 
I RD VLQNHLTEVLTLVAMELPHNVS S AFAVLJ^KLQVFQALRGL 
QRGSAWGQYQS YSEQVRRELQKPDSFHSLTPTFAGVLVH idnl 
RWEGVPFILKSGKALJDERVGYARILFKNQACCVQSEKHWAAAQS 
QCLPRQLVFHIGHGDLGSPAVLVSRNLFRPSLPSSWKEMEGPPG 
LRLFGS PLSDYYAYS PVRERDAHSVLLSHI FHGRKNFFITTENL 
LASK^Fi^TPLLESIAHKAPRLYPGGAEJ^GRLLDFEFSSGRLFFS 
QQQPEQLVPGPGPGPMPSDFQVLRAKYRESSLVSAWSEELISKI* 
ANDIKATAVRAVRRFGQFHLALSGGSS PVALFQQLATAHYG FPN 
AHTHLWLVD ERCVP L3DP ES NFQGLQAHLLQHVRI PYYNIH\AM 
PVHLQQRLCAE EDQGAHI YARE I S ALGANSS FDLVLLGMGADGH 
TASLFPQSPTGLDGEQLVVLTTSPSQPHRRMSLSLPLINRAKKV 
AVLVMGRMKRE I TTLVSR VGHE PKKWP I SGVLPHSGQLVWYMDY 
DAFLG 


6138 


45B7 


934 


EFSKLTDRWQNAVQGVRQRKGD VlXjijVKywyur 1 1£> vc,NJL»r Kr L* 
TDTSHLLSAVKGQERFSLYQTRSLIHELKNKEIHFQRRRTTCAL 
TLEAGEKLLLTTDLKTKESVGRRISQLQDSWKDMEPQLAEMIKQ 
FQSTVETWDGCEKKIKFILXSPJLQVL 

EL I KKT»RQSLASWTQNLKELC^niKADLTRH^ 1 E 
HLHRQWEDLCLRVA 1 R KQEIBDRLHTWVVFNEKNKELCAWL.VQM 
ENKVLQTADI S IEEMI EKLQKDCMEEINLFSENKLQLKQMGDQL 
I KASNKS RAAE I DDKLNKINDRWQHLFDVI GSRVKKLKET FAFI 
QQLDKNMSNLRTWLAR I ESELS KPWYDVCDDQEIQKRLAEQOD 
LQRD IEQHS AG VE5 VFNI CDVLLHDS DACANETECDS I QQTTRS 
LDRRWRNI CAMSMERRMKIEF/TVJRLWQKFLDDYSRFEDWLKSAB 
RTAACPNS S EVL YTS AKE ELKR FEA FQRQ IHERLTQLEL I N KQ Y 
PJUARENRTDTASRLKQMVHEGNQRWDNIXJKRVTAVLRRLRHFT 
NQREBFEGTRESIXiVWLTEMDIjQLTNVEHFSESDADDKMRQLNG 
PQQE I TLNTNKI DQli I VFGEQLIQKSEP \ LDAVL I E DELEEI*HR 
YCQEVFG RVSRFHRRIiTSCT PGLEDEKEASENETDM ED PRE I QT 
DSWRKRGESEEPSSPQSLCHLVAPGHERSGCETPVSVDS\IPLE 
WDHTGRRGGPSSSH\EEDEEAQYY\SALSGKSISDGHSWHVPDS 
PSCPEHHYKQMEGDRNVPPVPPASSTPYKPPYGKLLIiPPGTDGG 

KEG PRVLNGNPQQEDGGLAG I TEQQSGAFDR WEM I QAQE1»\HNX 
LKI KQNLQQLNSDI SAITTWIiKKTEAELEMLKMAKPPSDIOEI E 

LR VKRIjQE I LKAFDT YKAIi WS VNVSS KE FIjQTE S PE S TELQS R 
LRQLiS IJjWEAAQGAVDS WRGGLRQSLMQCQDFHQLS QNIiIikWIiA 

saknrrqkahvtdpkadprallecrrelmqlekelverqpqvdm 
lqeisnsi*likghgedcieaeekvhvi\eicklkqlreqvsqdlm 
alqgtqnpasplpsfdevdsgdqppatsvpaprakqfravrtte 

GEEETESRVPGSTRPQRSFL»SRVVRAALPIjQI*IaLLI»LIjI*IiACLL 
PSSEEDYSCTQANNF\ARSFYPMLRYTNGPPPT 



6139 


52 


1131 


LGDWVWSRTCGVLETPTSVLRRARARGPCPTDSKWAIiPRl^REGE 

TERRPWEASSWKTL/IiAGWIGGAASVIVGHPLDTVKTRLQAGVG 
YGNTI*S C I R WYRRES MFGF FKGMS FPLAS IAVYNS WFGVFSN 
TQRFIjS QHRCGEPEAS pprtlsdlllasmvag WSVGLGGPVDL 

ikiri^mo/tppvsgrqprfevqgsgscgXepayqgpvhcittiv 

RNEGIAGI#YRGASAMLIjRDVPG YCLYFI P YVFLSEW I TP EACTG 
PSPCAVWIAGGMAGAI S WGTAT PMD WKSRLQ ADGVYX.N KYKGV 
LDCI SQS YQKEGLKVFFRG I TVNAVRGFPMS AAMFLGYEIiSLQA 

IRGDHAVTSP 

6140 


694 


136 


RPELELWRLRSRSWRPI>GVPRRCHRRNWKEPVRAyPt^VTVWAP 

RCQRP /QPPAPEPSSPNAAVPEAI PTPRAAASAALELPLGPAPV 
SVAPQAEAEARSTPGPAGS RLGPETFRQRFRQFR YQDAAGPREA 
FRQLREL/SPRQWLRPDl\RTKEQ\lVEMLVQEQLriAILPEAAR 

ARRIRRRTDVRITG 



6141 


2 


984 


AQVG PRSRP CKM PLKLRG KKKAKS KETAG LVEG £ FTGAGGG S L*S 

ASRAPARRLVFHAQLAHGSATGRVEGFSSIQEI>YAQIAGAFEIS 
PS E I LYCTLNTPKIDMERLLGGQLGLEDFI FAHVKGIEKEVNVY 
KSEDSLGIjTITDNGVGYAF I KRIKDGGVIDSVKTI CVGDH I ESI 
NGEN I VGWRHYDVAKKLKE LKKEELFTMKLI EP KKAFE I EI*RS K 
AGKSSGEKIGCGRATLRI,RSKGPATVEEMPSETKAK\AIEKIDD 
\nLELYMGIRDIDlATTMFEAGKDKVNPDEFAVALDETLGDFAFP 

DEFVFDVWGVIGDAKRRGL» 



6142 



6143 



116 



2802 



602 



L I VEWVNQEin>EKDEKEQV'ANKGEPLALPI*NVS E YCVPRGNRRR 
FRVRQP I I^YRWDIMHRIjGEPQARMREENMERIGEEVRQIjMEKL 

rekqlshslravstdpphhphhdefc\lmp 

"27^ 1 frmr i flhcp wnqqmwki wnlitetslesckahiis i qklxlker \q 

\qlpvfkhrds ivetlkrhrvwvaget\gsgkstqvphfi»led 
i^lneweaskcnivctoprrisavslanrvcdelgcengpggrw 

SUTGYQIRMESRACESTRLLYCriGVLLRKLQEDGLl^SNVS/HM 
FIVDEV\HER\SVQSDFLI»III*KElLQIO?SDLHLILMSATVDSE 
KFSTYFTHCPILRISGRSYPVEVFHLEDIIEETGFVLEKDSEYC 
QKFLEEEEEVTINVTSKAGGIKKYQEYI PVQTGAHADLNPFYQK 
YSSRTQHAI LYMNPHKINLDL I LELLAYLDKS PQFRNI EGAVXi I 
FLPGIAHIQQLYDMiSNDRRFYSERYKVI ALHS ILSTQDQAAAF 
TI^PPGVRKIVIiATNIAETGITIPDWFVirTORTKENKYHESS 
QMSSLVETFVSKASALQRC^RAGRWDGFCF^YTRERFEGFMD 
YSVPEILRVPLEELCLHIMKCNLGSPEDFLSKALDPPQIiQVISN 
AMNLIJlKIGACEI^EPKLTPLGQHLAALPVimCIGKMLIFGAIF 

GCLDPVATLiAAVMTEKSPFTT P IGRKDEAD1AKSALAMADSDHI* 
TTVKTAYI^WKKAROEGGYRSEITYCRRNFLNRTSLIiTLEDVKQE 
LIKLVKAAGFSSSTTSTSWEGNRASQTLS fqeiallkavlvagl 
YDNVGKI I YTKSVDVTEKIiAC I VETAQGKAQVHPSS VNRDLQTH 
GWLLYQEKI R YARVYLRETTL I TPFPVLLFGGDIEVQHRERI>I>S 
IDGW I Y FQ AP\HCI AVI FKQIjRVLI DS VLRKKLENP KMSLiENDKI 
LQIITEIilKTENN 


6144 


1289 


S68 


SGPGSMSGQRWVKVVMLGKEYVGKTSIjVERYVHDRFLVGPYQN 
VSASGGARHGGRGSGGPVTCTYGPDIjFPLVAVTIGAAFVAKVMS 
VGDRTVTLGIWDTAGSERYEAMSRIYYRGAKAAIVCYDLTDSSS 
FERAKFWVKELRSLEEGCQIYIiCGTKSDLLEEDRRRRRVDFHDV 
ODYADN I KAO LFETSS KTGO SVDELFQ KVAED YVS VAAFQ VMTE 
DKGVDLGQKPNPYFYSCCHH 


6145 


1109 


196 


GGMDI^ELERDNTGRCRIiSSPVPAVCRKBPCVLGVDEAGRGPVL 
G PMVYA I C YCP LPRLADLEALKVADSKTLLES ERERX.FAKMEDT 

DFVGWALDVLS PNL I STSMIjGRVKYNLNSLSHDTATGL I Q YALD 
QGVNVTQVFVDTVGMPETYQARIX3QSFPGIEVTVTCAKADAIiYPV 

\VS AAS I CZAKVARDO^VKKWQFVEKLQDIiDTDYG\SGYPNDPQD 
/TKAWLKEHVEPVF\GFP\QFVRF\SWRTAQTI\LEKEAEDVIR 
EDSAS ENQEGLRKI TS YFI*NEGSQARPRSSHR YFLERGLESTTS 
L 


6146 


428 


781 


LKKKGKEKAEAQQVEALPGPSLDQWHRSAGEEEDGPVI,TDEQKS 
R / YPGHEAHDQGG\ WDARQS IIRKWDPETGRTRLIKGDG EVLE 
EIVTKERHRE INKQATRGDCLiAFQMRAGLLP 


6147 


1 


2304 


GTRQIjPPPSPGSGPGDSPEGPEGEAPERRRKAHGMLKIiYYGIjSE 
GEAAGRPAGPDPLDPTDI^GAHFDPEVYLDKLRRECPLAQIjMDS 
ETDMVRQI RALDSDMQTIjVYENYNKF I S ATDTI RKMKNDFRKME 
DEMDRLATNMAVITDFS ARI SATLQDRHERI TKIjAGVHALIiRKL 
QFLFEDPSRIiTKCVELGAYGOAVRYQGRAOAVIiQQYQHLPSFRA 
I0DDCG VITARLAQQLRQRFREGGSGAPEQAECVTEIjIjIiAIjG e pa 
EELCEEFLAHARGRLEKEIjRNLEAELiGPSPPAPDVLEFTDHGXS 
SG FVGGLCQ VAAA YQEJL FAAQGPAGAE KLAAFARQLG S RY FALV 

errlaqeqgggdnsixvpju>drfhrrlrapgallaaagi^ 
eivervarefo^hhi^lraafix3ci,tdwqaiju^ 
glaellanvass ilshikasiaavhlftakevsfsnkpyfrgef 
csqgvregli vgfvhsmcqtaqsfcds pgbkggatppallllls 
rlciidyetati s yiltltdeqfbvqdqfpvtpvs ti>cae ar e t a 
rrlithyvkvqglvisqmijlksvetrdwlsti^prnvravmkrv 

VEDTTAI DVOVLPRLAGVALTQAGGTVP SRGAGAAEDHWQS LPG 
GGDMC I WA^HGAS S VARAS VREPQGNKS PRMNTKRAGECI*C PRS 
CSFSAQDYDIFAPILPVEKQRIiRVTQEVRAGLVIiVLKIRPQTNS 

CILPLPHSTGS INSDHVPTK 


6148 


3056 


353 


VPAVGGTFADGAKGEAEKFHY I YS COLD INVQLXIGSLEGKREQ 
KSYKAVXiEDPMLKFSGLYQETCSDLYVTCQVFAEGKPIiALPVRT 
SYKAFSTRWNWNEWI^PVKYPDLPRNAQVAIjTIWDVYGPGKAV 
PVGGTTVSLFGKYGMFRQGmDLKVWPNCRSQMDQKPTKTPGRT 
SSTI^ETOMSRIiAKLTKAHRQGHMV^CVDWLDRLTFREI EMINES 
VKRSSNFMYTJ^IGGFRCVKCDDKEYGIVYYEKDGDESSPILTSFE 
LVKVPD PQMSLENLVESKHHNLPRSLRSGPSDHDLKPYPSPRDQ 
LKNIVSYPPSKPPTYEEQDLVWEFRYYXTNQDKAL.TKILTSVIW 
DLPQGAKQAIJU.IX3KWKPMDVEDSLELLSSHYTNPTVRRYAVAR 
LRQADDEDLLMYLLQLVOALKYENFDDIKNGLEPTKKDSQSSVS 
ENVSNSGINSAEIDSSQIIT/SAPFPSVSSPPP\ASKTKEVPDG 
ENLEQDLCTFL»ISRASKNSTLANYLYWYVIVECEX)QDTO^RDPK 

THEMYLNVMRRFSOJ^LKG^KSVRVMRSIJjJAO^^ 
KAVQRESGNRKK3CNERLQALLGDNEKMMLS DVELI PliPLE PQ VK 
IRGII PETATLFKSALMPAQLFFKTEDGGKYPVI FKHGDDLRQD 
QLILQI ISLMDKLLRXENIiDLiCLTPYKVLATSTKHGFMQFIQSV 
PVAEVLDTEGS IQNFFRKYAPS ENGPNG ISAEVMDTYVKSCAGY 
C^ITYII^VGDRHLDNLL.LTKTGKLFHIDFGYILGRDPKPLPPP 
M KLNKEMVEGMGGTQ S EQ YQE FRKQCYTAFLHIjRR YS NL I LNLF 
SLMVDANI PDIALEPDKTVKKVQDKFRLDLSDEEAVHYMQSLID 
ESVHALFAAWEQIHKFAQYWRK 


6149 


1 


1413 


RVDPRVRENGTANP I KNGKTSPAS KDQRTG KKTSVQGQVQ KGND 
ESESDFESDPPSPKSSEEEEQDDEEVIiQGEQGDFNDDDTEPENL 
GHRPIiLMDS EDEEEEE KHSSDSDYEQAKAKYSDMS S VYRDRS G S 
GPTQDIiNTI IiliTSAQliSS DVA VETPKQE FDVFGAVPFFAVRAQQ 
PO^EKNEKNLPQHRFPAAGLEQEEFDVFTKAPFSKKVNVQECHA 
VGPEAHTIPGYPKSVDVFGSTPFQPFLTSTSKSESNEDIiFGLVP 
FDE ITGSQQQKVKQRSLQ KLSSRQRRTKQDMS KSNG KRHHGTP T 
STKKTLKPTYRTPERARRHKKVGRRDSQSSNEFLTI sds keni s 
VAI*TDGKDRGNVLQPEESLDDPFGAKPFHSPD\LSWHPP\HQGL 
S\DIRADHNT\VXiPGR\ prqnslhgsfhsadvlkmdd FGAVP/F 
LTELWQS ITPHQSQQSQPV\ELDPFGAAPFPSKQ 


6150 


372 


37 


MSNI KKY 1 1 DYDWKAS I E I E IDHDVMTEEKLHQINNFWSDSE YR 
I^KHGSVI^VLIMLAQHAI^IAISSDI^AYGWCEFDWNDGNG 
QEGWPPMDGSEGIRITDIDTSGI f 


6151 


1555 


521 


DSNQQSVSGTAASTLLHS FKATI YYQGTGHVQQF YGVTS PYSQT 
TPP I VQSYAQPSLQYIQGQQI FTAHPCX5VWQPAAAVT1-I VAPG 
QPQPIiQPSEMVVTl^LDIiPPPSPPKPKTIVLPPNWKTARDPEG 
KIYYYHVITRQTQWDPPTWESPGDDASIiEHEAEMDLGTPTYDEN 
PMK\ASKKPKTAEADTSSELAKKSKEVFRKEMSQFXVQCLNPYR 
KPDCKVG\RITTTEDFKHIJU^KI*THGVMNlCBl4iCYCKNPE\DLEC 
NENVKHKTKEY I KKYMQKFGAVYKPKEDTEFRVTVGFGWEDGWS 
GKTDSRERKSCGPFCSTPVSTVLLMIHHPGEFNPADVN 


6152 


1366 


648 


NRTWSTPSTWMGVAIjPPLCSTGPWPVTRQITARTTCGAVPAKCP 
PWC/DVHEPRCQPPDCHGKGTCVTOHCQCTGHFWRGPGCDELiDC 

gpsncsqhglctetgcrcdagwtgsncseecplgwhgpgcqrpc 
kcehhcpcdpktgncsvsrvkqci^ppeatlragelsfftrtaw 
italtlalaflll i staanls lllsrae rnrrlhgd yayhplqem 
ngeplaaekeqpggahnpfkd 


6153 


2 


3368 


GRVGARSPGRAYAltliLLIil CFNVGSGIiHLQVI^STRNENKLLPKH 
P HX>VRQ KRAW I TAP VAL LEG EDLS KKN P IAKIHSDLAEERGLKI 
TYKYTGKG I TE P P FG I FVFN KDTGEXiNVTS IIjDREETPF FI»LTG 
YAl^ARGNNVEKPLELRIK\nJDINDNEPVFTQDVFVGSVEEl*SA 

AHTLVMKINATDADEPNTLNS kis yrivslepayppvfylnkdt 
GEI YTTS VTLDREEHSSYTLTVEARDGNGEVTDKPVKQAQVQ IR 
I LDVNDNI PWENKVLEGMVE ENQVNVE VTRIKVFDADE IGSDN 
WLANP^FASGNEGGYFHIETDAQTNEGIVTLIKEVDYEEMKNLD 
FS VI VAKKAAFHKS I RSKYKPTP I P I KVKVKNVKEGIHFKS SVI 
S IYVSESMDRSSKGQ1 IGNFQAFDEDTGLPAHARYVKLEDRDNW 
I SVDSVTSE IKIiAKLPDFESRYVQNGTYTVKIVAI SEDYPRKTI 
TGTVTjINVED XNDNCPTLi I E P VQT I CHJDAr»Y VrJ V 1 Ac. u JjJJIjH V N 

sgpfsfsvidkppgmaekwkiarqe^tsvllqqsekklgrse iq 
fi*isdnqgfscpbkqvi,tltvcevlhgs\gcreaqhdsyvglgp 
aaiai^ii^fllia^vpllllmchcgkgakgftpipgtiemlhp 
wnnegappedkvvpsfi,pvdotgslvgp^gvggmakeatmkgss 
sasivkgqhemsemdgrweehrs llsgratqftgatgai \mtte 
ttitafatgasrdvagaqaaavalneeflknyftdkaasyteed 
enhtakdcllvysqeeteslnas igccsfiegelddrflddlgi* 
KFKTIiAEVCLGQKI D I N KE I eqrqkpatets mntashs lceqtm 
vnsent^ssgssfpvpkslqeanaekvtqeivtersvssrqaqk 
vatplpdpmaspjjviatf/rsyvtgstmppttviljgpsqpqsliv 
tervyapastlvdqpyanegtvvvterviqphgggsnplegtqh 
LQDVTYVMVRER2SFIJtf>SSGVQPTl^^ 

APASTIjQSSYQI PTENSMTARNTTVSGAGVPGPLPDFGLE2SGH 
SNSTITTSSTRVTKHSTVQHSYS 


6154 


3660 


2146 


KKKT KMKNTLQ KTVNFGAW P KPT I SD KSHLLQM VS KLDLTDAKN 
SDTAHI KS 1EI TSILNGLQASESSAEDSEQEDERGAQDMDNNGK 
EESKJDHLTNNRNDLISKEBQNSSSIJjEEKKVHADLVISKPVSK 
SPERLRKDIEVI^EDTDYEEDETVTKXRKDVKKDTTDKSSKPQIK 
RGKR R YCNTEECLKTGS PGKKEEKAKNKESLCMENSSNSS SDED 
EEETIU^KmPTKKY^LEEKRKSLRTTGFYSGFSEVAEKRIKLL 
NNSDERLQNS RAKDRKDVWS S IQGQW PKKTLKELFSDSDTEAAA 
SPPHPAPEEGVAEESLQTVAEEESCSPSVELEKPPPVNVDSKPI 
EEKTVEVNDRKABFPSSGSNFSA* I PLPYLHLNRLHQSL * QKGS 
RQQS S VTVS E PIiAPNQEEVRS I KS ETDSTI E VDS VAGELQDLQS 
ERE* 3UASRF*CQCELEQ* *SARTRTS* BCSLYRSEKSERCSGRRK 
F I KKAEKKP * SNSGKQQ KEG K 


6155 


869 


121 


HLIiPELRGKS WITWKYVFYLGVLAGTFFFADS S VQKEDPAP YLV 
YLKSHFNPC^GVLIKPSWI^PAHCYLP^KV^ 
TEQT1NPIQ I VRY WNYSHSAPQDDLMI* IKLAKPAMIiNPKVQALN 
P\PTTIJVRPGTVCLI^GLDWSQENSGRHPDLRQNLEAPVMSDRE 
CQKTEOGKSHRNSLCVKFVKVFSR I FGEVAVATVT CKDKIiQGIE \ 
VGHFMGGDVG I YTNVYKYVS W I ENTAKDK 


6156 


5725 


3984 

* 


GTST VTI^TKKHFS IILNIiLGMLLKKDNQirrRKIi 
VMKKSETYAPLFCLPSFHKFCKGLI^DTLVEDVNICLQACSSLH 
ALS S S LPDDLLQRCVDVCRVQL VHRGTCI RQAFGKLLKS I PLGV 
FI^NNNHTEIQEISIAIiRSHMSKAPSlJTFHPQDFSD/VISFILY 
GNSHRTGKDKWLERLFYSCQRIJDKRDQSTIPRNLLKTDAVLWQW 
AXWEAAQFTVLSKIJiTPI^RAQOTFQTIEGIIRSLAGHTIaNPDQ 
DVSQWTTAIJKDEGHGNNQIjRLVLIjIjQ YTjENIiEKI^ YKAYEG CAN 
ALTS P PKVIRTFL YTNRQTCQD WLTR I RLS IMRVGLLAGQ PAVT 
VRHGFDLLTEMXTTSLSQGNELEVS IMMWEALCELHCPEA1QG 
IAVWSSS XVGKHLLWINSVAQQAEGRFEKASVEYQEHLCAMTGV 
DCGISSFDKSVLTLASAGCKSASLKHCLiNGESRKSVLSKPTDSS 
PEVI NYLGNKACE CYI STADWAAVQEWQNAIHDLKKSTS S TS LN 
LKADFNYIKSLSSFESGKFVECTEQLELLPGENINI*LAGGSKEK 

IDMKKLLRNM 


6157 


946 


329 


MAKRGPSYGLSREVQEK1EQKYDADLENKLVDWI I LQCAEO I EH 
PPPGRAHFQKWLMDGTVLCICjINSLYPPGOEPI PKI SESKMAFK 
O^QISQFLKAAETYGVRTTDIFQTVDLWEoKDMAAVQRTLMAL 
GSVAVTKDDGCYRGEPSWFHRKAQQNRRGFSEEQLRQGQNVI gl 

qmgsnkgasqagmtgygmprqim* DAASCP 


6158 


441 


1482 


lgslivlslhckvifssqsleramkekavdlvpilaqnpglaqn 

P ILEGKDHNQNTGVDPIIDHVQDRKTD/SRSKS PHKKPJSKSRER 

RKSRSRSHSRDKRKDTREKI KEKERVKEKDREKEREREKEREKE 
v tmr* irKnrr\x>TMrx? » e» vnD pirriV P trro? FIT? RP RTTRHRKDROKEICE1CE 

QDKEKEREKDRS KE IDEKRKKDKKSRTPPRSYNASRRSRSSSRE 
RRRRRSRSS SRS PRTSKTI KRKS SRS PS PRSRNKKDKKRE KERD 
HI SERRERERS TSMRKSSNDRDGKEKLE KNSTSLKE KEHNKEPD 
S S VS KE VDDKDAPRTEENKI QHNGNCQLNEENLSTKTEAV 


6159 


53 


84 


AVIAPLH I S LGDRARP YLKNTEKSSTTCS RRRNQS FPPVMSLTH 
RLHLCKYWGCAVSNVCRFWEGRPLPIiMIWPYTLPVStjPVGSCV 
I ITGTP ILTFVKDPQLEVNFYTGMDBDSDI AFQFRLHFGHPAIM 
NSCVFGIWRYEEKCYYLPFEIX5ICPFBLCIYVRHKEYKVMVNG0R 
I YNFAHRFPPAS VKMLQVFRD I S LTRVL I SD*GRC VRI TAVQE F 
DVSVSCDCTTAYQPG 


6160 


1626 


1790 


AGAKFFP* F * KVADAQPTESEKE I YNQVNWLKDAEG I LEDLQS 
YRGAGHEIREAIQHPADEKLQEKAWGAWPLVGKLKKFYEFSQR 


6161 



455 



6162 



1081 



6164 



90 



1569 



586 



785 



406 



6165 



90 



406 



LEAALRGIiLGALTS TP YS PTQHliE REQAXAKQFAEI IiHFTLRFD 
ElJOMTNPAIQNDFSYYRiOT*SRMRlNNVPAEGENEVNNELANRM 
SLFYAEATPMIiKTLSDATTKFVSENKNLP I ENTTDCLS TMASVC 
RVMLETPE YRS RFTNEETVS FCLRVMVGVI I L.YDKVHPVGAFAK 
TSK I DMKG CI KVLKDQ P PNSVEGLIjNAI*RYTTKH1iNDETTSKQI 
KSMItQ * QLLTLVNKG 

PVSGSESSIJIRAI>^I1JUjMIK3PRVAVSIIiCE13GISH*IjI*EKH* " 
KSHVIjEPI»SSIiAIiEEQCIiAI^I*DWSTX3inGRAGDQPIiKI ISSDS 
TGQL^LMVNETRPRiQKVASWQAHQFEAWIAAFWYWHPEIVYS 
GGDDGLLRGWDTRVPGKFIiFTSKRHTWGVCS IQSSPHREHILAT 
GSYDSHILI^WDTRNMKQPLADTPVQGGVWRI KWHP FHKHLUjAA 
CHHSGFKILNCQKAMEERQEATVLT3HTLPDSLVYGADWSWLLF 
RSLQRAPSWSFPSNIXTTKTADLKGASELPTPCHECREDNDGEGH 
ARPQSGMKPLTEGMRKNGTWIiQAT AATTRDCGVN PEEADSAFS I* 
LATCS FYDHALHIiWEWEGN 



RTIHATGRAGAS PMHRIiI VWRI_AEANKQHVRCQKCI_EFGHWTYE 
CIGKRKYXJIRPSRTAELKKALKEKENRl.L.LQQS IGETNVERKAK 
KKRSXSVTSSSSSSSDSSASDSSSESEETSTSSSSEDSDTDESS 
SSSSSSASSTTSSSSSDSDSDSSSSSKQ*HQHR*QL*R*TTKEE 

EKE I EltliHS YWTDGLKTLM 

RIRSTTEGCAVRI^PTQNTGKARIMILI^ 

TPWFVFFFFFFHRKE * VMQKNPMKSREDEWMEKliNNl-CHVQRAD 
MNRLIMNYIjVTEGFKEAAEKFRMESG I EPSVDLETIjDERIKIRE 
MII i KGQIQEAIAI*INSLHPELDDTNRYLYFH_^QHIiIBI.IRQR 
ETEAALEFAQTQI*AEOTEESRECLTEMERTIjaiI_AFDSPEESPF 
GDLLHTMQRQKVW S E VNQAVIJ>YENRESTPKI*AKLI-KL3_.I-WAQN 
EI^KKVKYPK^ITDLSKGVIEEPK 



PCQS PGRS RMRQDKLTGSLRRGGRCLKRQGGG VGT I IjSNVhKKR 
SCISRTAPRI-LCTl^PGVDTKI-KFTIjEPSLGQNGFQQWYDAIi^ 
VARLSTGIPKEWRRKVWLTIiADHYIoHSIAlDWDKTMRFTFWERS 
NPDDDSMGIQIVTCDI-HRTGCSSYCGQEAEQDRVVIjKRVX-LAYA^ 
WNKTVX3Y CQGFNIIAAIjI LBVMISGNEGDAIiKIMI YLIDKVIjPES 
YFVNrn^RALSVDMAVFRDLLRMK^ 

G YEP P LTNVFTMQ W F L>TX> FAT CL»PNQTVIiKI WD S VF FEG S E 1 1 L. 
RVSIAI WAKLGEQI ECCETADEF^STMGRLTQEMLElTOLIiQSHE 
IJ^Q/IVYSMAPFPFPQIjAEIiREKYTYNI TPFPATVKPTS vsgrhs 
KARDSDEEKDPDDEDAVVNAVGCLGPFSGFLAPELQKY QKQI KE 

PNEEQS LiRSNNIAELS PGAINS CRSE YHAAFNSMMM ERPTTTD IN 
ALKRQYSRIKKKQQQQVHQVYIRADKGPVTSILPSQVNSSPVIN 

HLLIiGKlQ^KMTNRAAjCNiAVI H I PGHTGGKI S PVP YEDLXTKLNS 
PWRTHIRVHKKNMPRTKSHPGCGDTVGL IDEQNEAS KTNGLGAA 
E AFPSGCTATAGREGS S PEGS TRRTI EG Q S PEPVFGDADVDVS A 
VQAKLGALEIjNQRDAAAETELRVHPP CQRHCPEP P S APEENKAT 
SKAPQGSNSKTPI FS PFPSVKPIiRKSATARNLGIjYGPTERTPTV 

hfpqmsrsfskpgggnsgp*kmvfssgtmlsrqlpgypqeyqrn 

GGERFG 



PCQSPGRSRMRQDKLTGSL RRGGRCLKRQGGGVGTI VLKKR 
SCISRTAPRLLCTLEPGVDTKXKFTLEPSLGQNGFQQWYDALKA 
VARLSTGIPKEWRRKVWIiTIiADHYI-HSXAIDWDKT^ 
NPDDDSMG I Q I VKDLHRTGCS S YCGQEAEQDRVVLKRVXJjAYAR 
WNKTVG YCQG FNTLAAIiI LEVMEGNEGDAL KI M I YT» IDKVI*PES 
YFVNNIjRALSVD.MAVFRDLIjRN^ 

GYEPPLTNVFTMQWFLTLFATC^PNQTVLKIWDSVFFEGSEIIL 
RVSLAI WAKLGEQI ECCETADEFYSTt^RIjTQEK-IiEiroi-LQSHE 
LMQTVYSMAPFP FPQIjAE LREKYT YNITPFPATVKPTS VS GRHS 
vnpnfiDERNDPDDEDAVVNAVGCLGPFSGFlAPELQKYQKQIKB 
PNEEQSIjRSNN iaels pgains crs e yhaafns mmmermttdi n 
alkrqysr1 kkkqqqqvhqvy iradkgp vts i lpsqvnss pvin 
huo^kkmk^ni^aaknavihipghtggkispvpyrolktklns 

PWRTH I R VHKKNMPRTKSHPGCGDTVGLIDEQNEASKTNGLGAA 
EAT P S G CTATAG REG SS P EGSTRRT I EGQS PEP VFGHADVDVSA 
VQAKLGALEIiNQRDAAAETEIjRVHPPOQRHCPE P PSAPEENKAT 
SKAPQGSNSKTPIFSPFPSVKPIjRKSATARNLGLYGPTERTPTV 
HFPQMSRSFSKPGGGNSGP*KMVFSSGTMLSROLPGYPQEYQRN 

GGERFG 


6166 


2 


1206 


HKl*WRTVAMAGAEWKSIiEECLEKHliPLiPDliQEVKRVI»YGKEI*RK 
LDLPREAFEAASREDFELG^YAFEAAEEQLRRPRIVHVGLVQNR 
I PLPANAPVAEQVSALHRRI KAI VEVAAMCGVNI ICFQEAWTMP 
FAFCTREKLP WTEFAES AEDG PTTRFCQIOjAKNHDMVVVS PILE 
RD S EHGD VLWNTA VVI SNSGAVLG KTRKNH I PRVGDFNESTYYM 
EGNLGHPVFOTQFGRIAVNICYGRHHPl^LMYS INGAEI I FNP 
ciATTfiATj^ESIAJPIEARNAAIANHCFTCAINRVGTEHFPNEFTS 
GDGKKAHQDFGYFYGSSYVAAPDSSRTPGI*SRSRDGLLVAKLDL» 
NLCQQVNDVWNFKMTCRYEMYARELAEAVKSNYSPTIVKE* PAS 

VPALG 


6167 


1220 


1844 


YG I VTG PSLCAGDKQP KKQE KNP VI»VS PEFVDEAJbCACEE YL»SN 
LAKMDIDKDLEAPLYLTPEGWSIjFIiQRYYQVVHEGAELRHIiDTQ 
VQRCEDI LQQLQAWPQI DMEGDRNI W I VKPGAKSRGRGIMCMD 
HLEEMIiKLVNGNP WMKDG KWWQ KY I ERPI>Ii I FGTKFDI*RQW F 
LVTDWNPltTVWFYRDSY I RFSTQPFSLKNLDK*APLYLTPEGWS 
LFLQRYYQWHEGAE LRHLDTQVQRCEDI LQQI*QAWPQIDMEG 
nPNTWT\HCPGAKSRGRGIMCMDHLEEMIjKLVNGNPVVMKDGKWV 
VQKYIERPLLI FGTKFDLRQW FLVTDWN P LTVWFYRDS Y I RFST 
QPFSLKNLDK 


6168 


84 


1392 



VWPVPSVSAMPPKKQAQAGGSKKAEQKKKEKIIEDKTFGIjKNKK 
GAKQQKF I KAVTHQVKFGQQNPRQVAQSEAEKKLKKDDKKKE LQ 
ELNELFKPWAAQKISKGAD PKS WCAFFKQGQCTKGDKCKFSH 
DLTLERXCEKRS VYI DARDE ELE KDTMDNWDE KKLEE VVNKKHG | 
EAEKKKPKTQIVCKHFIjEAIENNKYGWFWVCPGGGDICMYRHAL 
PPGFA/LKKKKKKKiCKEDE IS L* DLI ERERSAXGPNVTKITIiES F 
lAWKKRKRQEKIDKIJSQDMERRKADFKAGKALVISGREVFEFRP 
ELVNDDDEEADDTRYTQGTGGDEVDDSVSVNDIDLS lyiprdvd 
ETGITVASLERFSTYTSDKDENKLSEASGGRAENGERSDLEEDN 
EREGTENGAIDAVPVDENIiFTGEDIiDEIJEEEIjNTLDLEE 


6169 


112 


662 


APAAAMAERPEDLNLPNAVITRI I KEALPDGVNISKEARSAI SR 
AAS VFVLYATS CANNFAMKG KRKTLNAS DVLSAMEEMEFQRFVT 
P LKEALEAY RREQKG KKEAS EQKKKD KDKKTDS EEQDKS RDEDN 
DE DEBRLEEEEQNE EEEVDN * KGRETVAPWKVPLEMRRATCFCE 
AFPCWAE 


6170 


62 



667 


STKVhttiPNTGRIAGCTVFITGASRGIGKAIAIiKAAKDGANIVIA 
AKTAQPHPKLLGTI YTAAEE IEAVGGKALPCI VDVRDEQQISAA 
VE KAI KKFGG I D I LVNNASAI S LTNTLDTPTKRIiDLMMNVNTRG 
TY1J^KACIPYLKKSKVAHIPNISPPI^LNPVWFKQHCGRW"*VV 

G * GDGI*CL I CFELNLCMSDV I TI CT 


6171 


382 


941 


HFWSDVEL^DIEPCXJHTKFPPTLPIiSTTVIVCSCHPVATAST 
MAEAFSKTTSEEDQSIQEPKEANSMTAQKQKK*GLRGSRRRHAN 
SGGDI FGDS FAAY F PRVLKQ VHQALS LSQEAVSVMD S MVRD I LD 
R IATEAGHLAHYSKCVTI TSRDIRMAVCLLLPGKMGKLAESQGT 
NATLRYTKSK 


6172 


651 


54 


GLCRAGGAHRFSRTHVEAALKMljRREARIjRREYIfYRKAREEAQR 
SAQERKERLRRALEENRXiI PTEIiRREALAJLQGSLEFDDAGGEGV 
TSHVD0EYRWAGVEDPKVMITTSRDPSSRX.3O4FAKELKLVFPGA 








QRM^GIUiBVGALVRACKANGVTDLI.VVHEHRGTPVGLIVSHI,P 
PGPTAYFTLCNVVMRHDI PDIiGTMSEAKPHLITHGFS SRLGKRV 
SDI LRYliF PVPKDDSHRV IT FANQDD YI S FRHHVY KKTDHRNVE 
LTEVG P RF EL KL YM I RLGTLE Q EATADVEWR WHP YTNTARKR VF 
LSTE*AAPRPLGQLL 


6173 


3 


288 


S VDHRE VQVLSQS M PLT PHQAVLRGERP YMCVECGKC FGRS SHL 
LQHQRIHTGEKPYVCSVCGKAFSOSSVLSKHRTIHTGEKPYECN 
ECGKAFRVSSDLAQHHKI HTGEKPHECLECRKAFTQLSHLI QHQ 
R I HTGERP Y VC PLCG KAFNHS TVLRSHQRVHTGEKPHRCNECGK 
TFSVKRTLLQHQR IHTGEKP YTCSECGKAFSDRSVL IQHHNVHT 
GEKPYECSECGKTFSHRSTI^NHERIHTEEKPYACYECGKAFVQ 
HSHLIQHQKVHRKL* PTCVLSVGSAIiAGVPTSFSISVSTI,ERSP 
MCAVYVGR P SARAQS LVNTGQFTQVRS PMS VMS VEKPLE 


6174 


1060 


959 


PRPPGKRWWAGIjGNPGLPGTRHSVGMAVlXJQIJUlRLGVAESWT 
RDRHCAADliALAPLGDAQLVLLR PRRLMN ANGRSVARAAELFGL 
TAEEVYLVHDELDKPLGRLALKLGGSARGHNGVRSCIS CLNSNA 
MPRLR VG IGRPAHPEAVQAHVLGCFSPAEQELLPLIjLDRATDLX 
LDHIRERSQGPSLGP *H * WFSKKA 


6175 


2204 


334 


RYFRADPRSRSGQPRAEGLGAFAEGPLRAMAAPVXGNRKQSTEG 
DALD PPASP KPAGKQNG I QNP I SI*EDSPEAGGEREEEQER£EEQ 
AFLVSLYKFMKERHTP IERVPHIX3FKQINLWKIYKAVEKLGAYE 
LVTGRRLWKNVYNELGGS PGSTSGATCTRRH Y * RLVLPYVRHL.K 
GBDDKPLPTS KPRKQYKMAKENRGDDGATERPKKAKEERRMDQM 
MPGKTXADAADPAPLPSQEPPRNSTEQQGLASGSSVSFVGASGC 
PEAYKRLLSSFYCKGTHGIMS PIiAKKKLLAQVSKVEAIiQCQEEG 
CRHGAEPQASPAVHLPESPQSPKGLTENSRHRLTPQEGIjQAPGG 
SLREEAQAGPCPAAPI FKGCFYTHPTEVLKPVSQHPRDFFSRLK 
DGVLDG PPGKEGLS VKEPQLVWGGDANRPS AFHKGGSRKG I IiYP 
KPKACWVSPMAXVPAESPTLPPTFPSSPGLGSKRSLEEEGAAHS 
GKRLRAVSPFIiKEADAKKCGAKPAGSGLVSCLLGPALGPVPPEA 
YRGTMLHCPIiNFTGTPGPLKGQAALPFSPLVI PAFPAHFLATAG 
PSPMAAGLhfflFPPTSFDSAXJ^HRLCPASSAWHAPPVTTYAAPHF 
FHLNTKL 


6176 


1040 


402 


PLSAliRAMAEVHVIGQIIGASGFSESSLFCKWGIHTGAAWKI*L»S 
GVREGQTQVDTPQ IGDMAYWS HPI DLHFATKGLQGW PRLH FQW? 
SQDS FGRCQLAG YG FCHVPS S PGTHQIiACPTWRPLGS WREQLAR 
AFVGGGPQLLHGDTIYSGADRYRtiHTAAGGTVHLE IGLLLRNFD 
RYGVEC*GTIjPPTSPPSTPRTPSDGGGWHSGQEHRL 


6177 


1400 


992 


VPIESLVGKVHNFPLIAFYCCEKGKRQPHKSLHDRCFGEALDPN 
CSHCYLDQI KRSDFLGFSGYS PHFVAI STNSEHKMQPSSMQQAL 
PSQ*PYWTDPRPALVPCCSHRPDVHRSRPGPGLPGTSGCSDRPP 
VCPI 


6178 


1027 


254 


STQRGGI KGVARAAS LVGRRRAGTGMALLLCLVCLTAALAHGCL 
HCHSN FS KKFS F YRHHVN FKS WWVGD I PVSGALLTDWSDDTMKE 
IiHLAI PAK I TRE KLJX2 V ATA VYQMMDQ L» YQG KM Y t F0» x 1 h'ri KJL»K 
NIFREQVMLIQNAIIESR1DCQHRCGIFQYETISCNNCTDSHVA 
CFGYNCESSAQWKSAVQGLLNY INNWHKQDTSMRPRSSAFSWPG 
THRAAPAFLVLPALRCIiEPPHLANIiSLEDAA* CLKQH 


6179 


806 


276 


RGETREMAGNLLSGAGRRLWDWVPIiACRSFSLGVPRLIGIRLTL 
PPPKVVDRWNEKRAMFGVYDNI GI LGNFEKHPKELIRGPI WLRG 
W KGNE LQRCIRIO^KMVGSRMFADDLHNIjNKRI RYL YKHFNRHG K 
FR+ KRKLRTS EKAH LS PWRRETVLFPVRXRLCIFSVI KWGFFG I 


6180 


156 


1833 


DHH I LKAAS TTHVCARGN 1 FAI PNTRCLEC* ATATP SS LECQN * 
SHLSLCPLPATTSGLTPNSMI P E KERQN IAERLLRVM CADLGAL 
SWSGKEFLKLAQTLVDSGARYGAFSVTEIIjGNFNT 
MYNQVKVKVTCALGSNACIjG I GVTCHSQS VGPDSCY I LTAYQAE 








GNHI XSYVI^VKGABIRDSGDLVHHWVQNVI^EFVMSEIRTVyV 
TDCRVSTSAPSKAGMCLRCSACAI^SWOSVLSKRTI^QARSMHE 
VIEIjLNVCEDLAGSTGIAKETFGSLEE^ 

HERYEQI CEFYSRAKKMNLI QSLNKHLLSNLAAI LTPVKQAVI E 
LSNSSQPTI^LVLPTWRLEKljFTAKANDAGrVSKLCHLFLEAL 
KEJ^KVHPAHKVAMILDPQQKLRPVPPYQHEEIIGKVCELINEV 
KESWAEEADFEPAAKKPRSAAVENPAAQEDDRIXSKNEVYDYLQE 
F LFQATPDLFQYWS CVTQKHTKLAKIAFWLIiAV PAVGARSGCVN 
MCEQALLI KRRRLLSPEDMNKLMFLKSNML 


6181 


169 


1032 


TRTIiLSPVIjlaPGPRMKPWRRRPMGPLiAIfPAWLQPRYRKNAYLFI 
YYLI QFCGHSW I FTNMTVRFFS FGKDSMVDTPYAIGLVMRLCQS 
VSLLELLHI YVGI ESNHLLPRFLQLTER III LFWITSQEEVQE 
KYWCVLF VFWNLLDMVRYTYS MLS V I G I S YAVLTWLSQTLWM P 
TVDT^n/T.npaPATYO^I^PYFESFGTYSTKIiPFDIjSIYFPYVLKI 
YLMMLFIGWyFTYSHLYSERRDIIjGIFPIKKKKM*STAFQCDTR 
KDRLWIQCSK*NTGSILVEKFLVF 


6182 


1769 


1224 


AS* IDYQLNTIiIJCEFQLTEENTKI*RYLTCSl»IEDMAAAYFPDCI 
VRPFG SS VNTFGKLGCDIjDM FLDLDETRNLS AHKISGN FLMEFQ 
VKNVPSERXATQKII^Vl^ECLDnFGPGCVGVQKILNARCPL^ 
c Clio A cf2FriPT>I/TTNNT* X AI/TSSELLY I YGALiD S R VRAL VFSVR 
CWARAHSIiTSS I PGAW I TNFSLTMMVI FFLQRRSPP ILPTLDSL 
KTLM>AEDKCVIEGNNCTFVRt)I*SRlKPSQNTETTJEIiIjLKEFFE 
YFGNFAFDKNS IN1RQGREQNKPDSS PLY I QNPFETS LNI SKNV 
SQSQLQKFVDliARESAWILQQEDTDRPSISSNRPWGLVSLLLPS 
APNRKS FTKKKSNKFAI ETVKNLLES LKGNRTENFTKTSG KRT I 
STQT 


6183 


1118 


452 


HLDRYI KS PG S GS STPAPPSHLLL YLLHPQSTRTMGCCG CS RGC 
GSGCGGCGSSCGGCGSGCGGC5GSGRGGCGSGCGGCSSSCGGCGS 
RCYVPVCCCKP VCS WVPACS CTS CGS CGGS KGGCGS CGGS KGGC 
GSCGCSQSS CCKPCCCSSGCGSS CCQSSCCKP CCCQSS CCVPVC 
CX3SSCCKPCCCQSNCCVPVCCQCKI * GSGPRP SGFS CLVXAFLM 
VP 


6184 


1 


2191 


I VT VREEDGAPAVAP PG VWS RANKRS GAG PGGSGGGGARGAEE 
EPPPPLQAVLVADSFDRRFFPISKDQPRVLLPLANVALIDYTLE 
FLTATGVQETFVFCCWKAAQIKEHLLKSKWCRPTSLNWRIITS 
ELYRSIX3DVLRBVDAKALVRSDFLLVYGDVISNINITRALEEHR 
LRRKL* KNVSVMTMI FKESS PSHPTRCHEDNWVAVDSTTNRVL 
HFXJKTG^LRl^AFPI*SLFOX3SSDGVEVRYDLLDCHIS ICSPQVA 
'QLFTDNFDYQTPJDDFVRGlAVNE£IIiGNQIHMHVTAKEYGARVS 
NLHMYSAVCADVI RRWVYPLTPEANFTDSTTQSCTHSRHNI YRG 
PBVSLGHGSIIiEEMVLLGSGTVIGSNCFITNSVIGPGCMIEPGD 
NVVIJDQTYIjWQGWVAAGAQIHQSLLCDNAEVKERVTLK^ 
TSQWVGPNITIiPEGSVISLHPPDAEEDEDDGEFSDDSGADQEK 
DKVKMKGYNPAEVGAAGKGYLWKAAGMNMEEEBELQQNLWGLKI 
NMEEESESESEQSMDSEEPDSRGGSPQMDDIKVFQNEVLGTLQR 
G KEEN IS CDNLVLE INSLKYAYN I SLKEVMQVLSHWLEFPLQQ 
MDS PLD S S R YCALLLPLLKAW S P V FRN Y I KRAADHLEALAAI ED 
PFI^EHEALGISMAKVIjMAFYQLEIIAEETIL^ 
QLRKNQQLQRFI QWLKEAEEESSEDD 


6185 


791 


44 


P CTS CVLWATLHLP ASTRKAPQAECGM I S ITEWQKI GVG I TG FG 
IFFILFGTLLYFDSVLLAFGNLLFLTGLSLIIGLRKTFWFFFQR 
HKLKGTS FLLGGWIVLLRWPLLGMFLETYGFFSLFKGFFPVAF 
G FLGNVCN I PFLGALFRR LQGTS S MV* KTEMSSLNLDHWLKGAK 
RBEWEP P PQS P ALTHS PTYPGP PQVQKERNGAEQLTSNPQVDS R 
GCQEAEMQTPRRLGWGWYHTLTLYLWEEK 


6186 


569 


238 


VYGIDSSNTNT«GA£ERNRKLKKHWKLCHAQSRIJDWGLALI<WA 








KERK\nOJKVKNKADraEVFNNSPTI^EKMPTSAII*PD?SGSVIS "" 
WI RNQMETLHSQPHQEENLCFENS FSIiINLLPINAVEPTSSQQ I 
PNRETSEANKERRKMTSKSSESNI YSPLTSFITADSELHDI 1 KD 
LEDCIjMVGLHTCGDLAPNTLRIFTSNSEIKGVCSVGC^ 
EFE^QHKBRTQEKWGFPMCHYLKEERWCCGRNARMSACIiALERV 
AAGQGI* PTES LFYRAVIjQD 1 1 KDCYG I TKCDRHVG KI YSKCS S F 
I*DYVRRSUCKUGIiDESKLPBKI IMNYYEKYKPRMNELfiAFNMLK 
WliAPCIETLI LLDRIiCYIiKEQED I AWSALVKLFDPVKSPRCYA 
VIAIiKKOQ* FPLKQI IRCISL * D S AG CAE EVS VGDGG P ALiRDAP 
PSGSRVGSRYD 


6187 


1701 


771 


DAWGPETRLARILNPDSFIEPRPGRLPELEATRPHMEPKASCPA 
AAPLMERXFHVLVGVTGSVAALKLPLLVS KLLDI PG LEVAWTT 
ERAKHFYS PQDI PVTI*YSDADE WEMWKS RSDPVLH I DLRR WADL 
LLVAPLDANTLGKVASG ICDNLLTCVMRAVTDRSKPLJbFCPAMNT 
AMVTEHPITAQQVDQLKAFGYVEIPCVAKKIiVCGDEGIjGAMAEVG 
TIVDBCVKEVLFQHSGFQQS* PG I SVMGVPI*YSE WVQAKSVKMDV 
GKIGGYPHL.LNGGPAl.SLPRGQACSRLNVJTEGPGIiSFFQPGEAA 
A 


6188 


238 


1534


KG FV>JAGP IMAELQVS PQWKAP EMS Q I CLS CGHPSA* GPRWAS W 
NlGWICIRC^GIHPJH.GVHISRVXSVNIiDQWTQEQIQCMQEMG 
NGKANRLYEAYLPETFPJIPQIDPAVEGFIRPKYEKKKYMDRSLD 
INAFRKEKDDKWKRGSEPVPEKKbEPWFEKVKMPQKKEDPQLP 
RKSSPKSTAPVMDIiLGliDAPVACS IANS KTSNTLEKDLDLLASV 
PSPSSSGSRKWGSMPTAGSAGSVPENLNLFPEPGSKSEEIGKK 
QIjSKDSIIjSLYGSQTPQMPTQAMFMAPAQMAYPTAYPSFPGVTP 

PNS I MGSMMPPP vgmvaqpgasgmvapmampag ymggmqasmmg 

V PNGMMTTC^AG Y^GMAAMPG/TVTC 

agmnfygangmmnygqsmsggneqaanqtlspqmwk 


6189 


1297 


793 


LGE PLGDLCELI PGDVQQLQMGE VH PGTGAOGS AAQS VAGE VQI» 
TQLSHARQR PS CQGS QI^IAIrDLQHMDI SRQPR WQHVQPVARQVQ 
RAQQAQLABGVAVHLWAGDAWAEVELIXJEVGGGKVFAANACDIi 
VVQDHEGAHAARQATGHAL»QRVI VQVRR VQPLEALi* RVPSGLPR 
RVRAFMILHKQITGIGREDFATTYFLEELNLSYNRITSPQVHRD i 
AFRKXiRI^RSLDLSGNRIjHMLPPGLPRNVHV^ 

GALAG MAQ LRE L YLiTSNRX>RS RALGP RAWVDLAHLQLLD X AGNQ 
I>TE I Pii(jljPKSLiEYIjYIjyNNKISAvPANAFU FljRFN 
KLAVG S WDS AFRRL KHLiQVLiD I EGN L E FGD I S KDRGRLG KEKE 
EEEEDEVEEEETR 






6190


1 Jj VGNV SFJjJjS FAE YV CWCS WGS IjNVNRCNQTTGQCE cr pg yq 
GLHCETCKEGFYLN YTSGLCQPCDCS PHGALS I PCNSSGKCQCK 
VGVTGS ICDRCQDGYYGFSKNGCLPCQCNNRSASCDALTGACLN 
CQENS KGNHCE ECKEGFYQS PDATKECLRCPCSAVTSTGS CS IK 
SSEI*E PECDQCKDGYIGPNCNKCENGYYNFDS I CRKCQCHGHVY 
PVKTPKICKPESGECINCIiHNTTGFVICENC^*GYVHDIjEGNCI k 
KVIIjPTPEXSSTILVSNASLTTSVPTPVINSTFTPTTLQTIFSVS 
TSENSTSALADVS WTQFNI I IliTVI 1 1 WVLLMGFVGAVYMYRE 
YQNRKLNAPFWTIELKEDNI SFSS YHDS I PNADVSGLLEDDGNE 
VAPNGQLTLTTPIHNYKA 


6191 


1212 


1511 


VKTLCHGGULHLSTHHI/3IKPSMH*LFFXJ^LSFPHLTPCi3PKCPS 
MIDWIKJCIWYIYTMEYYATIKRNEIMFFAGTWMEMEAJILSKLM 
QDYMFSLISGS 


6192 


3 


950 


TRG CGN KMAG KKNVLiS S LAVY AEDS E PES DGEAG I EA VGS AAEE 
KGGI*VS DAYGEDDFS RLGGDEDG YEEE EDENSRQS EDDOS ETEK 
P EADDP KDNTEAE KRDPQELVAS FS E RVRNMS PDEIKIPPEPPG 
RCSNHLQDKIQKLYERKI KEGMDMNYI IQRKKEFRWPSIYEKLI 
QFCAIDELGTNYPKOMFDPHG WSEDS YYEALAKAQKI EMDKLEK 








AKKE RTK I S FVTGTKKGTTTN ATS TTTTTAS TAVADAQ KRKS K W 
DSAI P VTT IAQ PT I LTTTATLP AVVTVTTS AS G S KTTV I SAVGT 
IVKKAKQ 


6193 


3 


950 


TRGCGNKMAGKKNVLS SLAVYAEDSSPESDGEAG I EAVGS AAEE 
KGGL.VSDAYGEDDFSRLGGDEDGYESBEDBNSRQSEDDDS3TBK 
PEADDPKDNTEAEKRDPQELVASFSERVRNMSPDEIKIPPEPPG 
RCSNHLQDKIQKLYBRKIKSGMDMNyi I QRKKE FRNPS I YEKLI 
QFCAIDELGTNYPKDMFDPHGWSEDSYYEAJjAKAQKIEMDKIjEK 
AKKERTKI E FVTGTKKGTTTNATS TTTTTASTAVADAQKRKS KW 
DSAI ? VTT I AQPTILTTTATLPAVVTVTTSASGS KTTVI SAVGT 

IVKKAKQ 


6194 


3 


950 


TRG CGNKMAG KKNVLSSLAVYAE DS EPESDG EAG I EAVGSAAEE 
KGGLVSDAYGEDDFSRLGGDEDGYEEEEDENSRQSEDDDSETEK 
PEADDPKDNTEAEKRDPQELVASFSERVRNMSPDEIKIPPEPPG 
RCSNHLQDKI QKLYERKI KEGMDMNY I IQRKKEFRNPSI YEKLI 
QFCAI DEXiGTNYPKDMFDPHGWS EDS Y YEALAKAQKI EMDKIiEK 
AKKERTKI EFVTGTKKGTTTNATS TTTTTASTAVADAQKRKS KW 
DSAI P VTTIAO PT I LTTTATLPA VVTVTTS ASG S KTTVI SAVG T 
IVKKAKQ 


6195 


736 


235 


VANGI*QSNMPKFYCI)YCDTYLTilDSPSVRKTHCSGRKHKENVKD 
YYQKWMEEQAQSL I DKTTAAFQOGKI P PTPFS AP P PAGAM I PPP 
PSLPGPPRPGMMPAPHMGGPPMMPMMGPPPPGMMPVGPAPGMRP 
PMGGHMPMMPGPPMMRP PAR PMMVPTRPGMTRPDR 


6196 


1512 


623 


KTGKRRSAAYVRNILDNAEQVISliljEARNLGPRLTPLLQEEDSH 
QRLIiMGIjMVS ELKDHFLRHLQGVE KKKI EQMVLD Y I S KL LDL I C 
HI VETNWRKKNLHSWVLHFNSRGSAAEFAVFHIMTRI LEATNSL 
FLPI»PPGFHTLHTIIiGVQCXPIiHNLIJICIDSGVI*IjI»TETAVIRL 
MKDLDNTEKNBKLKFSI IVRLPPLIGQKI CRLWDHPMSSNI ISR 
NHVTRI>IiQWYTOCQPRNSMINKSSPSVEFLPLNYFIEILTDIESS \ 
NQAIiYPFEGHDNVDAEFVEEAAI.KHTAMI.LGL 


6197 


3 


819 


ADPEGTESAVMSRYTRPPWTSLFIRNVADATRPEDLRREFGRYG 
P I VD VYI PLD FYTRRP RG F AYVQ FED VRDAE DALYNLNRKWV CG 
RQI E I Q FAQGDRKTPGQMKS KERHPCS PSDHRRS RS P SQRRTRS 
RSSSWGRNRRRSDSLKESRHRRFSYSQSKSRSKSLPRRSTSARQ 
SRTPRRNFGSRGRSRSKSLQKRSKSIGKSQSSSPQKQTSSGTKS 
RSHGRHSDS LARS PCKSPKGYTNFETKVQTAKHSHFRSHSRSRS 
YRHKNSW 


6198 


111 


1912 


SEAALS PS FI SPACFLLRKLPALEDGTLPHPDTLGMN YEGARSE 
RENHAADDSEGGALDMCCSERI.PGLPQPIVMEALDE7LEGLQDSQ 1 
REMPPPPPPSPPSDPAQKPPPRGAGSHSLTVRSSLCLFAASQFL 
IACGVLWFSGYGHIWSOJWTNLVSSLLTLLKQLEPTAWLDSGTW 
GVPSIJt>LVFI^GGLVLVTTLVWHLLRTPPEPPTPLP PEDRRQSV 
SRQPS FTYSBWMEEKIEDDFLDLDPVPETP VFDCVMDI KPEADP 
TS ijTVKS M GLQERRG SNVSLTL DM CTPG CN EEG FGYLM S P RE ES 
AREYLLSASRVLQAEELHEKALDP FLLQAEFFE I PMNFVD PKEY 
DI PGLVRKNR YKT1 LPNPHSRVCLTS PD PDDPLSS Y I NANYIRG 
YGGEEKVY IATQGP I VSTVAD FWRMVWQEKTPI IVKITNIEEMN 
EKCTEYWPEEQVAYDGVE ITVQKVIHTEDYRLRLISLKSGTEER 
GLKHYWFTSW PDQKTPDRAPPLLHLVREVEEAAQQEGPHCAPII 
VHCSAGIGRTGCFIATSICCQQLRQEGWDILKTTCQLRQDRGG 
MIQHCEQYQFVHHVMSLYEKQLSHQSPE 


6199 


144 


1211 


MARENGESSSSWKKQAEDIKKIFEFKETLGTGAFSEVVLAEEKA 

TGKLFAVKCI PKKALKGKESS IENEI AVLRKIKHENIVALEDI Y 
ESPNHLYLVMQLVSGGELFDRIVEKGFYTEKDASTLIRQVLDAV 
YYLHR^IVHRDLKPENIXYYSQDEESKXMISDFGI^KMEGKGD 
VMS TACGT PG YVAPE VLAQKP Y£ KAVDCWS I GVIAYI LLCGYPP 



464 



WO 01/53312 



PCT/US00/34263











r i jJrJNi>o nXiT 1 1 ljfV>iil.Xc> r L/i> r X ViL)lJ±^U^J\J\JJr xKiMJ_il v lr>rLUP 

NKRYTCEQAARHPWIAGDTALNKNIHESVSAQIRKNFAKSKWRQ 

AFNATAVVRHMRKLHLGSSIJ>SSNA^^ 

HAL* 


6200 


702 




96 


Li PE V fc*HS liKPK VKPHLCCAQ PA VR VMARJjP KJjA V F U Lit) XT-bW A* t" 
WVDTHVDPPFHKSSIXnVRDRRGQDVRLYPEVPEV^ 
PGAAASRTSEIEGANQLIjEIiFDLFRYFVHREIYPGSKITHFERIj 
QQKTG I P FS QMI FFDDERRN I VDVSKLG VTC I H IQNGMNLQTt»S 
OGLETFAKAQTC-PLRSSLEESPFEA 


6201 


2809 


2383 


GQTPR VRWKMRRSLRAGKRRQTAGRKS KS PPKVP I VIQDDSLPA 
GPPPQIRILKRPTSNGWSSPNSTSRPTLPVKSIAQREAEYAEA 
RXRILGSAS PEEEQEKP ILDRPTR I S QPEDSRQPNNVTRQPLGP 
DGSQGFKQRR 


6202 


2 


426 


INADRAAVASSU^RPTRKMAPQKDRKPKRSTWRFNI/DbTHPVE 
DGI FDSGNFEQFIiREKVTCVKGKTGNLGNVVHI ERFKNKITWSE 
KQFSKRYLKYLTKKYXKKNNLRDWLRVVASDKETYELRYFQISQ 
DKDESESED 


6203 


419 


2550 


RCPRPPATAGAAASRPDRS PPSGI SGSEAAAGAGAAAPASQHPA 

TGTGAVQTEAMKQ ILGVI DKKLRNLEKKKGKIiDDYQERMNKGER 

LNQDQLBAVS KYQEVTNNLEFAKELQRS FMALSQDI QKT I KKTA 

RREQLMREEAEQKRIiK7IVLEIjQYVI^K1X5DDEVRTO 

PII^E^LSLLDEFYTCLVDPF,RDMSLRIjNEQYEHASIHLWDIJLE 

GKEKPVCGTTYKVLKEIVERVFQSNYFDSTHNHQNGLCEE3EAA 

SAPAVEDQVPEAEPEPAEEYTEQSEVESTEYVNRQFMAETQFTS 

GEKEQVDEWTVETVE WNS LQQQPQAAS PS VPE PRSLT P VAQAD 

PLVRRQRVQDI^QMQGPYNFIQDSML.DFENQrL.DPAIVSAQPM 

NPT^MDMPQI.VCPPVHSESRI*AQPNQVPVQPEATQVPIjVSSTS 

EGYTASQPLYQPSHATEQRPQKEPI DQIQATI SLNTDQTTASSS 

LPAASQPQVFQAGTSKPLHSSGINVNAAPFQSMQTVFKMNAPVP 

PVNEPETLKQQNQYQAS YNQS FSSQPHQVEQTELQQEQIiQTWG 

TYHGS PDQSHQVTGNHQQPPQQNTGFPRSNQP YYNSRGVSRGGS 

RGARGLMNGYRGPANGFRGGYDGYRPSFSNTPNSGYTQSQFSAP 

RDYSGYQRDGYQQNFKRGSGQSGPRGAPRGRGGPPRPNRGMPQM 

NTQQVN 


6204 


2933 


787 


CTHNLISLLGGRALIHFNRFLNLKIQEGEAHNIFCPAYDCFQLV 

PGD1 1 KS WSKEMDKRYLQFDIBCAFVENNPAI KWCPTPGCDRAV 

RLTKQGSNTSGSDTL^F1>IiIjRAPAVDCGKGHLFGWECIjGEAHEP 

CDCQTWKNWLQKITFMKPEEIjVGVSFAYEDAANCLWLLTNSKPC 

anckspiqkkegcnhmqcaxckydfcwic^ 

YRCTRYBVIQHVEEQSKEMTVEAEKKHKRFQELDRFMRYYTRFK 

NHEHSYQLEQRLLKTAKBKMEQUSRALKETEGGCPDTTFIEDAV 

HVLIiKTRRILKCS YP YGFFTiE PKSTKKE I FEXjMQTDLEMVTEDI* 

AQKVNRP YLRTPRHKI I KAACIWQQKRQEFLAS VARGVAPADS ? 

FIAPRR*5FAGGTWDWEYLdGFAS PEEYAEFOYRRRHRQRRRGDVHS 

IiLSNPPDPDEPSESTLDIPEGGSSSRRPGTSWSSASMSVLHSS 

SLRDYTPASRSENQDSLQALSSLDEDDPNILLAIQLSLQESGLA 

LDEETRDFX^NEASLGAIGTSLPSRLDSVPRNTDSPRAAtiSSSE 

LLEI^DSLMRI/3AENDPFSTDTLSSKPI>SEARSDFCPSSSDPDS 

AGQDPN INDNLLGNIMAWFHDMNPQSIALIPPATTEISADSQIjP 

CIKDGSEGVKDVELVLPEDSMFEDASVSEGRGTQIEENPLEENI 

PGGGKQH PQAW 


6205 


1 


1200 


RAHRGKMAIiEVGDMEDGQLSDSDSDKTVAPSDRPLQLPKVLGGD 
SAMRAFQNTATACAP VSHYRAVESVDS SEEIS FSDSDDDSCLVfKR 
KRQKCFNP PPKPB PFQFGQSSQ KP PVAGGKXINNI WGAVLQEQN 
QDAVATEliG I LGMEGT I DRSRQSETYNYLLAKKLRKE SQEHTKD 
LDKELDEYMHGGKKMGSKEEENGOCHLKRKRPVKDRLGNRPEMN 








Y KGRYB I TAEDSQEKVADE I S FRliQE PKKDLIAR WR I IGNKKA 
I 3LLMETAEVEQNGGLFI MNGS RRRT PGGVFLNLLKNTPS I £ E S 
QI KDI FYI ENQKEYENKKAARKRRTQVLGKKMKQAX KSLNFQED 

nTYPCDT?Tiri\ CTYT T Kn?IiTi7iQT.riPCrvP^WATI*21VT.P*2XT , PJi T E"l/TYUCVr> 

'1 1 PMrL'-Ar*"-**' 1 U/BO|fEAaWVdfU\ 1 ' ^--^ i r.uiin.inn 

LDIF 


6206 


10 


1442 


1 ISERRERSCLHLVCIRCSCDVVEMGSVLGI^MASW I PCLCGS 
APCXJLCRCCPSGNNSTVTRI*I YALFLLVGVCVACVML I PGMEEQ 
LNKI PGFCENEKGWPCN1 LVG Y KAVYRLCFGLAMFYTJjLSLLM 
I KVKS S S DPRAAVHNGFW FFKFAAAIAI I IGAFFI PEGTFTTVW 
FYVGMAGAFCFILIQLVLLIDFAHSWNESWVEKMEEGNSRCWYA 
AI^ATAIiNYLLSLVAIVLFFVYYTHPASCSElSFKAFISVNMIjl.C 
VGASVMSILPKIQESQPRSGliI*QSSVITVYTMYLTWSAMTNEPE 
TNCNPSIJLtSIIGYNTTSTVPKEGQSVQWWHAQGI IGLILFLLCV 
FYSSIRTSNNSQVNJOjTTjTSDESTLIEDGGARSDGSLEDGDDVH 
RAVDNERIXiV 1 xSYS Ft tu* MLFLASLY I WMT J_> IN Wiki EP^K-tM 
KSQWTAVWVKISSSWIGIVLYVWTLVAPLVLTNRDFD 


6207 


2924 


1471 


T VMAEAATPG TTATTSGAGAAAATAAAAS PTPI PTVTAPSLGAG 
GGGGGSIXSSGGGWTKQVTCTIYFMHGVCKEGDNCRYSHDLSDSPY 
SWCKYFQRGYCIYGDRCRYEHSXPIiKQEEATATEIiTTKSSIiAA 
SSSLSS1VGPLVEMNTGEAESRNSNFATVGAGSFJDWVNAIEFVP 
GQPYCX5RTAPSCTEAPLQGSVTKE3SEKEQTAVETKKQLCPYAA 
VGE CR YG ENCVYLHGDS CDM CGLQ VLHP MDAAQRSQ H I KSC I EA 
HEKDMELS FAVQRS KDMVCG I CMEVVYEKANPSERRFGILSNCN 
HTYCLKCIRKWRSAKQFESKI IKSCPECRITSNFV1PSEYWVEE 
KEEKQKLILKYKEAMSNKACRYFDEGRGSCPFGGNCFYKHAYPD 
GRREEPQRQKVGTS SR YRAQRRNHr*vELIfcfiKENSNPFDNDEE.b 
VV^F3IiGEMLI>IIjIjAAGGDDELTDSEIDEWDLFHDEliEDFYDLDIi 


6208 


2924 


1471 


TVMAEAATPGTTATTSGAGAAAATAAAAS PTP I PTVTAPSLGAG 
GGGGGSDGSGGGWTKQVTCRYFMHGVCKEGDNCRYSHDLSDSPY 
S WCKYFQRG Y C I YGDRCR YEHS K PLKQBE ATATELTTKSSLAA 
SS S LS S XVG PbVEMNTGEAES RNSNFATVGAGS EDW VNAI EFVP 
GQPYCGRTAPSCTEAPLQGSVTKEESEKEQTAVETKKQLCPYAA 
VGECRYGENCVYLHGDSCDMCGLQVLHPMDAAQRSQHI KSCTEA 
t_*T7K"OMPT JQ T77XVOT? CKDMVCGI CM RWYF.KANPS ERRFG I L.SNCN 
HTYCLKCIRKWRSAKQFESK1IKSCPECRITSNFVTPSEYWVEE 
KEEKQKLIIiKYKEAMSNKACKYFDEGRGSCPFGGKCFYKHAYPD 
GRREE PQRQXVGTSSRYRAQRRNHFWELIEERENSNP FDNDEEE 
VVTFEIiGEMIJJ4LIjAAGGDDELTDSEDEWDLFHDELEDFYDIiDL 


6209 


1758 


829 


ERLCFPCMQS KI Y S YMS PNKCSGMRPP LQE ENS VTHHE VKCQGK 
PLAGIYRKREEKRNAGNAVRSAMKSEEQKTKDARKGPLVPFPNQ 
KS RAAEPPKTP PSSCDSTNAAlAKQAIiKKP I KGKQAPRKKAQGK 
TQQNRKLTDFYPVRRS SRKS KAELQSEERKR I DELI ESGKE EGM 
KI DL IDGKGRGVI ATKQFS RGDFVVE YHGDLI E I TDAKKREAI»Y 
AQDPSTGCYMYYFQYLSKTYCVDATRETWRLGRLINHSKCGNCQ 
TKLHD I DGVPHL I LIASRD I AAGEELL YDYGDRS KAS I EAH P WL 
KH 


6210



3761 


387 



I FGMS KLRMVLLEDSGSADFRRH FVNL S PFT X TWLLLS ACFVT 
SSLGGTDKELRLVDGENKCSGRVEVKVQEEWGTVOWGWSMEAV 
S VI CNQLGCPTA I KAPGWANS S AGSGR I WMDHVS CRGNE SAL WD | 
OQIDGWGKHSNCTHG^DAGVTCSDGSNLEMRLTRGGNMCS GR I E 
IKFQGRWGTVCDDNFNIDHAS VICRQLE CGSAVS FSGSSNFGEG 
SGPIWFDDLICNGNESALWNCKHCGWGKHNCDHAEDAGVICSKG j 
ADLS LRL VDGVTECSGRLEVR FQGEWGTI CDDG WDS YDAAVACK 
QLGCPTAVTAI GRVNAS KG FGHI WLD S VS (ZQGHE P AVWQ CKHHE 
WGKHYCJ^NEDAGVTCSDGSDLEIiRI^GGGSRCAGTVEVEIQRL 
LGKVCDRGWGLKEADVVCRQLGCGSALKTS YQVYS K I QATNTWL 








FIjSSCNGWETSIjWDCKNWQWGGLTCDHYBBAKITCSAHREPRLV 
GGDIPCSGRVEVKHGDTWGS 1 CDSDFSLEAAS VLCRELQCGTW 
SILGGAHFGEGNGQIWAEEFQCEGHESHLSL.CPVAPRPEGTCSH 
SRDVGWCS R YTE I RLVNGKTPCEGRVELKTLGAWGSLCNSHWD 
I EDAHVLCQjQIjKCC VALSTPGGAR FGKGNGQI WRHMFHCTGTEQ. 
HMGDCPVTALGASLCPSEQVASVICSGNQSQTIiSSCNSSSLGPT 
RPTIPEESAVAC1ESGQLRLVNGGGRCAGRVEIYHEGSWGTICD 
DS WDLSDAHWCRQLGCGEAINATGSAHFGEGTG P I WLDEMKCN 
GKESRIWQCHSHGWGQQNCRHKEDAGVICSEF*MSLRLTSEASRE 
ACAGRLEVFTOGAWGTVGKSSMSETTVGVVCRQLGCADKGKINP 
ASLDKAMS I PMWVDNVQCPKGPIJTLWQCPSSPWBKRLASPSEBT 
W I TCDNK I RLQEG PTS CSGRVEI WHGG SWGTVCDDS WDLDDAQV 
VC(^I^CGPALKAFKEAEFGQGTGPIWIiNEVKCKGNESSLWDCP 
ARRWGHSECGHKEDAAVNCTDISVQKTPQKATTGRSSRQSSFIA 
VG ILGWLLAI FVALFFLTKKRRQRQRIAVSSRGENLVHQIQYR 
EMNSCLNADDLDLMNSSGGKSEPH 


6211 



3761 


387 


I FGMS KLRM VLLEDSGS ADFRRHF VNLS PFTITWLLLSACFVT 
SSI^GTDKELRLVTKSENKCSGRVEVKVQEEWGTVCNNGWSMEAV 
SVICNQLGCPTAI KAPG WANS SAG SGRI WMDIIVS CRGNE SALWD 
CKHDGWGKHSNCTHQQDAGVTCSDGSNIiEMRLTRGGKMCSGRIB 
IKFOGRWGTVCDDNFNIDHASVICRQIiECGSAVSFSGSSNFGEG 
SGPIWFDDLI CNGNESALWNCKHQGWGKHNCDHAEDAGVI CSKG 
ADLSLRLVDGVTECSGRLEWFC^EWGTIOTDGWDSYDAAVACK 
QLGCPTAVTAI GRVNAS KGFGHI WLDSVSCQGHEPAVWQCKHHE 
WGKHYCNHNEDAGVTCSDGSDLELRLRGGGSRCAGTVEVEIQRL 
LGXVCDRGWGLKEADVVCRQLGCGSALKTSYQVYSKIQATNTWL 
FLS S CNGNETS LWDCKNWQWGGLTCDH YEEAKI TCS AHRE PRLV 
GGDIPCSGRVEVKHGDTWGSICTSDFSLEAASVLCRELQCGTVV 
SILGGAHFGEGNGQIWAEEFQCEGHESHLSIiCPVAPRPEGTCSH 
SRDVGVVCS R YTE I R^VNGKTPCEGRVELKTLGAWGSLCNSHWD 
lEDAHVLCQQLKCGVALSTPGGARFG KGNGQI WRHMFHCTGTEQ 
HMGDCPVTALGASLCPSEQVASVI CSGNQSQTLSSCNSSSLGPT 
R PT I PEES AVACI ESGQLRLVNGGGR CAGRVE I YHEGS WGT I CD 
DSWDLSDAHWCRQLGCGEAINATGS AHFGEGTGP I WLDEMKCN 
GKESRIWQCHSHGWGQQNCRHKBDAGVICSEFMSLRLTSEASRE 
ACAGRLEV F YNG AWGTVG KS S MS ETTVG VVCR QLG CADKGKINP 
ASLDKAMS I PMWVDNVQCPKGPpTLY7QCPSSPWEKRLASPSEET 
W I TCDNKI RLQEG PTS CSGRVE I WKGGS WGTVCDD S WDLDDAQV 
VCQQliGCGPALKAFKEAEFGQGTGPIWLNEVKCKGNESSLWDCP 
ARRWGHSECGHKEDAAVNCTDI S VQKTPQXATTGRS SRQSS FIA 
VGILG WLLAI FVALFFLTKKRRQRQRLAVSSRGENLVHQ 1 Q YR 
EMNSCLNADDLDLMNSSGGHSEPH 


6212 


1 


1134 


LKWBLRPGGAVWGTGRGAGTGAPRSCCCQTNPGP P S S LRRAFRR 
RELPFPACHEIGLGAEAGSGPPPAPAARESRSRAMEEEASSPGL 
GCSKPKLEKLTLGITRILBSSPGVTEVTIIEKPPAERHMISSWE 
QKNNC^PEDVKNFYLMTKGFHMTWSVKLDEHI IPLGSMAINS I 
S KLTQLTQS SM YSL PNAPTLADLEDDTHEAS DDQPEKPHFDS RS 
VI FELDS CNGSGKVCLVYKSGKPALAEDTEIWFLDRALYWHFLT 
DTFTAYYRLLI THLGL PQWQYAFTS YG I S POAXQRVSMYKP ITY 
NTNLLTEETDSFATNKLDPSKVFKSKNKIV1PKKKGPVQPAGGQK 
GPSGPSGPSTSSTSKSSSGSGNPTRK 


6213 


1 


1134 


LKWELRPGGAVWGTGRGAGTGAPRSCCCQTNPGPPSS LRRAFRR 
RELPFPACHE IGLGAEAGSGPP PAPAARESRSRAMEEEASSPGL 
GCSKPHLEKLTI/3ITRILESSPGVTEVTI IEKPPAERHMISSWE 
QK2WCVMPEDVKNFYLMTNGFHMTWSVICLDEHIIPLGSMAXNSI 
SKLTQLTQSSMYSLPNAPTLADLEDDTHEASDDQPEKPHFDSRS 






1 

1 


V I FELDS CNG SG KVCLV YKSGKPALABDTE I W FLDRAL Y WHFLT 
DT FTAYYRLL I THLGL PQ WQYAFTS YGI SPQAKQR VSMY KP I T Y 
NTNLLTEBTT)SFVNKIJDPSKVFKSKNKTVIPKKKGPVQ 
GPSGPSGPSTSSTSKSSSGSGNPTRK 


6214 


2 


460 


HEIAPSAIRRAARIjGLGPARWQSRAAAFYFVRGFRTGWSFVGWV 
VIX^TSAKRTRLFFFI^KMAASSRAQVIJ^YRAMIiRBSia^FSAYN 
YRTYAVRR I R DAFRENKNVKD PVE I QTLVN KAKRD LG VI RRQ VH 
I GQLYSTDKLI I ENRDMPRT 


6215 



2 


1849


FVAGG PRGSGSAAETMPE I RVTPLGAGQDVGRSCI LVSIAGKNV 
MIiDCG^IHMGFNDDRRFPDFSYI^XJNGRttTDFLDCVIISHFHLIIH 
CGALP YFSEMVGYDGP I YMTHPTQAI CPILLEDYRKIAVDKKGE 
ANFFTSQMIKDCMKKWAVHLHQTVQVDDELEI KAYYAGHVLGA 
AMFQI KVGS ES WYTGDYNMTPDRHLGAAWIDKCRPNLL I TEST 
YATTIRDSKRCRERDFLKKVHETVERGGKVXiI PVFALGRAQELC 
lXLETFMERhlNLKVPIYFSTGLTEKANHYYKLFIPWTNQKIRKr 
FVQRNMFEFKHI KAFDRAFADNPG PMWFATPGMLHAGQS LQI F 
PJCWAGNEKNMVIMPGT^CVQGTVGHKILSGQRKLEMEGRQVLEVK 
MQVE YMS FSAHADAKG I MQLVGQAE P ES VLLVHGEAKKME FLKQ 
KIEQEIiRVNCYMPANGBTVTLPTSPSIPVGISIiGIjLKREMAQGL 
LPEAKKPRIiLHGTUMKDSNFT^VSSEQALKEIiGLAEHQLRFTC 
RVlujHIXrRkBQETAIiRVYS HLKSVLKDHCVQHLPDGSVTVES VL 
LQAAAPSEDPGTKVLLVS WTYQDEELGS FLTSLLKKGLPQAP S 


6216 


11 


393 


C^TTR^EPRNSALRQSRSKMAVVGVSSVSRIjLGRSRPQLGRPMSS 
GAHGE EGSARMW KTLTFFVALPGVAVSMLNVYLKSHHGEHERPE 
FIAYPHLRIRTKPFPWGDGNHTLFHNPHVNPLPTGYEDE 


6217 


9 


1178 


TRVGRGESGLKMEVKPPPGRPQPDSGRRRRRRGEEGHDPKEPEQ 
IJIKLFIGGI^FETTDDSLREHFBICWGTLTDCVVMRDPQTKRSRG 
FGFVTYS CVE BV15AAMCARPHKVDGRVVEPKRAVSREDSVKPGA 
HLTVKKI FVGG I KEDTEEYNLRDYFEKYGKIETI EVMEDRQSGK 
KRGFAFVTFTJDHmVDKIWQKYHTINGHNCEVKKALSKQENQS 
AGSQRGRGGGSGNFMGRGGNFGGGGGNFGRGGNFGGRGGYGGGG 
GGSRGS YGGGDGGYNGPGGDGGNYGGGPGYS S KCaC* x<Mj»fc?tji?<jiC» 
NQGGGYGGGGGYJX5YNF^3GNFGGGNYGGGGNYNDFGNYSGQQQS 
NYGPMKGGSFGGRS&GSPYGGGYGSGGGSGGYGSRRF 


6218 


1305


906 


" S CERRG F I MADDLKRFL Y KKL PS VEG LHAI WS DRDG VP V I KVA 
NDNAPEHALRPGFLSTF7aATDG^SKLGLSKNKSI ICYYNTYQV 
VQFNRLPLWSF XASSoANTvjJLi i. v^LiE.rJ^UAt'ljr r«EiJ-»rcu v vuva 


6219 


2 


890 


AGPGEGAGAGTRCAGAEAEMASAGGEDCESPAPEADRPHQRPFL 
IGVSGGTASGKSTVCEKIMELLGQNEVEQRQRKWILSQDRFYK 
VLTAEQKAKALKGQYNFDHPDAFDNDLMHRTUOSriVEGKTVEVP 
TYDFVTHSRLPETTVVYPADVVLFFX3ILVFYSQEIRDMFHLRX.F 
VDTDSDVRLSRRVLRDVRRGPJ5LEQILTQYTTFVKPAFEEFCLP 
TKKYADVI I PRGVDNMVAINLIVQHIQDILNGDICKWHRGGSNG 
R <5 YTTRTF^ F PGDHPGMLTS GKRSHLESS SRPH 


6220 


227 


764 


EQNISLEMSCTIEKALADAKALVERLRDHDDAAESIil EQTTALN 
KRVEAMKQYQEEIQEliNEVARHRPRSTLVMGIQQENRQIRELQQ 
ENKELRTSLEEHC^ALELIMSKYREQMFRLLMASKKDDPGIIMK 
LKEQHS K I DM VHRN KS EGF FLDAS RH I LEAPQHGLERRHLEANQ 
NVH 


6221 


98 


916 


RWIWDI^PVSDGIJBIjRPKYNGILHCLTTIWKL^ 

NI WGAGLS WGLYFVFYNAI KS YKTEGRAERLEATE YLVS AAEAG 

AMTLC I TN P LWVTKTRLM LQ YDA WNS PHRQ YKGMFDTL VK I Y K 

YEGVRGLYKGFVPGLFGTSHGMjQFMAYELLKIiKYNQH INRLP E 

AQLSTVEYI SVAALS KI FAVAATYP YQWRARLQDQHMF YSGVI 

DVITKTWRKEGVC^FYKGXAPNLIRVTPACCITFVVYENVSHFL 

LDLREKRK 


6222 


2 


2116 


MARELRAT.T J iWGRRIiRPLTiRAPALAAVPGGKP I IjCPRRTYAQLG 
PRRN PAWSLQAGRJaFSTQTAED ECEE PLHS I ISSTESVQGSTSKH 
EFQAETKKIJ^IVARSLYSEKEVFIRELISNASID^EKLRHKLV 
SDGQALP EM EI HLQTNAEKGT I TIQDTG IGMTQEEL V SNLGTIA 
RSGSKAFLDALQNQAEASSKI I GQFGVGFYS AFMVADRVEVYSR 
SAAPGSLGYQWLSDGSGVFEIAEASGVRTGTKI I IHLKSDCKEF 
SSEARVRDVVTKYSNFVSFPLYI^GRRMNTI^AIVfMMDPKDVRE 
WQHEEPYRYVAQAHDKPRYTLHYKTDAPLNIRS I FYVPDMKPSM 
FDVSRELGSSVALYSRKVLIQTKATDILPKWLRFIRGVVDSEDI 
PLNLSRELLQESAL IRKLRDVLQQRLI KFF I DQSKKDAEKYAKF 
FEDY^^FMR3GIVTATEQEVKEDIAKLLRYESSAI»PSGQI»TSLS 
E YASRMPJu^TRNI YYLCAPNRHLAEHS PYYEAMKKKDTEVLFCF 
EQFT)EIiTLLHLREFDKKKLISVETDIVVDHYKEEKFEDRSPAAE 
CLS EKETEEI*MAWMRNVI/^RVTfTVKVTLRLDTHPAhrVTVI*EMG 
AARHFIJ^MQQLAKTQEERAQLLQPTLE I N PRHAL I KKIiNQLRAS 
E PGIjAQIjLVDQI YENAM IAAGIiVDDPRAMVGRU4EIjIjVKAIjER^ 


6223 


3 


715 


DAWARTMAGMVDFQDEEQ VKS FLENMEVE CN YHC YHEKD PDGCY 
RLVDYLEGIRKI^DEAAKVLKPWCEEKQHSDSCYKI/BAYYVTGK 
GGLTQDLKAAARCFLMACEKPGKKSIAACHNVGLIJU1DGQVNED 
GQPDLGKARJD YYTRACDGGYTSS CFTJI*SAMFLQGAPG FP KD>5DI* 
ACKY SMKACDLGH I WACANASRMYlCLGDGVDKVEAKAEVXiKNRA 
QQVHKEQQKGVQPLTFG 


6224 


1 


133 


LRTISSMAWG PLLLTLLAHCTGS WAQSVLTQPPSVSGARI PHEK 


6225 


3259 


938 


LliSaRIiAICKJ^FSVESRKTVMGPQGARRQAFLAFGDVTVDPT 
QKEWRIiLS PAQRALYREVTLENYSHLVSLGILHS KPELIRRLEQ 
GEVPWGEERRRRPGPCAGIYAEHVLRPKNLGIiAHQRQQQLQFSD 
QSFQSDTAEGQEKEKSTKPMAFSSPPLRHAVSSRRRNSVVEIES 
SQGQRENPTE I DKVLKGI ENSRWGAFKCAERGQDFSRKMMVI IH 
KKAHSRQKLFTCRECHQGFRDESALIjIJHQNTHTGEXSYVCSVCG 
RGFSI*KANIiliRHQRTHSGEKP FLCKVCGRGYTS KS YLTVHERTH 
TGEK?YECQECX5RRFNDKSSYNKHL.<AHSGEKP FVCKECJvjHLiYT 

nksyfwhkrihsgekpyrcqecgrgfsnkshlithqrthsgek 
pfacrqckqs fsvkgsllrhqrthsgekpfvckdcersfsqkst 
lvyhqrths gekpfvcrecgqgfiqks tlvkhq i thsee kpfvc 
ki)cx;rgfiqkstftlhqrthseekpygcrecgrrfrdkssynkh 
lrahlgekrffcrdcgrgftlkpnltihqrthsgekpfmckqce 
KS FS lkanllrhq wthsgerpfnckdcgrgfi lks tllfhqkth 

SGEKP P I CSECGQGF I WKSNLVKHQLAHSGKQ P FVCKECGRG FN 
WKGNLIiTHQRTHSGEKPFVCNVCGQG FSWKRSI/FRHHWR IHS KE 
KPFVCQECKRG YTS KSDLTVHERI HTGERPYECQECGR K FSNKS 
YYSKHIiKRHLREKRFCTGSVGEASS 


6226 


29 


266 


TKVS ELtLGGS QRX»FFI*PLWRRLCRCGLGPRVS PMAGPRVEVDGS 
IMEGGGQSLRVSTGLSWLLSLPWRAQRIRAGRSYA 


6227 


j 




" mq as <n .t^phr p kg ognkvongs VHOKDGLNDDD FE ? YX»S PQAR P 
NNAYTAMSDS YLPS YYS PS IGFSYSLGEAAWSTGGDTAMP YbTS 
YGQLSNGEPKFI*PDAMFXK)PGJUVGSTPFIiGQHGFN?FPSGIDFS 
AWG1TOSSQGQSTQSSGYSSNYAYAPSSLGGAMIDGQSAFANETL 
NKAPGMNTIDOGMAALKLGSTEVASNVPKWGSAVGSGS I TSNI 
VASNSLPPATIAPPKJPASWADIASKPAXQQPKLKTKNGIAGSSL 
PPPP I KHNMD I GTWDNKGPVAKAPSQAL VQNIGQPTQGSPQP VG 
QQANNSPPVAQASVGQQTQPLPPPPPQPAQIjSVQQQAAQPTRWV 
APRNRGSGFGHNGVDGNGVGQS QAGSGSTPSEPH PVLEKLRS IN 
NYNPKDFDWNIiKHGRVFI IKSYSEDD I HRS I KYN I WCSTEHGNK 
RLDAAYRSMNGKG PVYIjLFSVNGSGHFCGVAEMKSAVDYNTCAG 
VWSQDKWKGRFDVRW I FVKDVPNSQI*RH I PIjENNENKPVTNSRD 
tqevplekakqvlki I ASYKHTTS I FDDFSHYE krq 


6228 


47 


1978 


GRRC^RRGAVMEIiAQEARELGCWAVEEMGVPVAARAPESTLRRX. 
CLGQGADI WAYILQHVHSQRTVKKI RGNLLWYGHQD S PQVRR KL 
RI.EAAVTR T.TJAF TORTiDOS"LELM ERDTEAODTAMEQAROHTODT 
QRRALLLRAQAGAMRRC^HTLRD PMQRLQNQLRRLQDMER KAKV 
DVT FGSLT S AALGLE PVVLRDVRTACTLRAQFI^NLI^PQAKRG 
SLPTPHDDHFGTSYQQWI^SVETIjLTNHPPGHVLAAIjEHLAAER 
EAE I RS LCSGDGLG DTE I S RPQAPDQSDS SQTLPSMVHL IQEG W 
RTVGVLVSQRSTLLKERQVLT0RLCX5LVEEVERRVLGSSERQVL 

I I^LRRCCLWTELKALHDQSQELQDAAGHRQLLLRELQAKQQR I 
LHWRQLVEETQEQVRLLI KGNSAS KTRLCRS PGE VLALVQRKVV 

DTTTrmm DflODCT T OPT PtrC\rPin.DUTT .T ./^TT .T .PTTP 'DfZ'^'T .V0T.P 
r If x^i*v>\lrsJoKEkJbJbK^u£»ILa>VKn^ 

TVLPSIHQLHPASPRGSSFIAI^HKIXSLPPGKASELLLPAAASL 
RQDLLiLQDQRSLWCWDLLHMKTSIiP PGLPTQELLQ I QASQEKQ 
QKENLGQALKRLEKLLKQALERI PELQG I VGDWWEQPGQAALSE 
ELCQGLSL PQWRLRWVQAQGALQKLCS 


6229 


1571 


560 


GPS LLGTRGTPNPARTLQIFFL I IGRJU/TGRMAAVDDLQFEEFG 
NAATSLTAN PDATTVNIEDPGETPKHQPGS PRGSGREEDDELLG 
NDDSDKTELLAGQKTs3S PFWTFE YYQTFFDVDT YQVFDR I KGSL 
LP I PGKfcJFVRLYIRSNPDLYGPFWI CATLVFAIAISGNLSNFL1 

VMNIVSYSFLEIVCVYGYSLFIYI PTA1LW 1 1 PHKAVRWIDVMI 
ALG I S G S LLAMTFW PAVREDNRR VALAT I VT I VLLHMLLS VG CL 
AYFFDAP EMDHLiPTTTATPNG/TVAAAKS S 


6230 


1723 


600 


SKMSGRSGKKKMSKLSRSARAGVI FPVGRLMRYLKKGTFKYRIS 
VGAPVYMAAVIEYIJULEILEIJVGNAARJDNKXARIAPRHILLAVA 

PPEKRGRKATSGKKGGKKSKAAKPRTSKKSKPKDSDKEGTSNST 
SEDGPGDGFTILSSKSLVLGQKLSLTQSDISHIGSMRVEGIVHP 
TTAEIDLKEDI GKALEKAGGKEFLETVKELRKSQGPLEVAEAAV 
SQSSG1AAKFVIHCHI PQWGSDKCEEQLEET I KNCLSAAEDKKL 
KSVAFPPFPSGRNCFPKQTAAQVTLKAISAHFDDSSASSLKNVY 
FLLFDS ESIGI YVQEMAKLDAK 


6231 


149 


870 


LIFSSSTMDRS1JINVLWSFGFLLLFTAYGGLQSLQSSLYSEEG 
LGVTALSTLYGGMLLSSMFLPPLLI ERLGCKGTI ILSMCGYVAF 
SVGNFFASWYTLIPTS I LLGLGAAPLWS AQCTYLTI TGNTHAEK 
AGKRGKDMVNQYFGI FFLI FQSSGVWGNLI S SLVFGQTPS QETL 
PEEQLTSCGAS DCLMATTTTNSTQR PSQOLVYTLLG I YTGSGVL 
AVLMIAAFLQP IRDVQRESE 


6232 


3679 


1476 


pvagttmag fwvgtaplvaagrrgrw ppqqlmls aalrtlkhvl 
yys rq clmvs rnlg svg yd pne kt fd k i lvanrg e i acr v i rt c 
kkmgixtvaj:hsdvdassvhvi<madeavc^^paptsksy^ 

IMEAIKKTRAQAVHPGYGFI^ENKEFARCLAAEDVVFIGPDTHA 
IOAMGDKIKSIOjLAKKAEVNTIPGFDGVVKDAEEAVRIARE I GY 
PVMIKASAGGGGKGMRIAWDDEETRDGFRLSSQEAASSFGDDRL 
LIEKF IDNPRHIEIQVLGDKHGNALWLNERECS I QRRNQKWEE 
APS I FLDAETRRAMGEQAVALARAVKYSSAGTVEFLVDS kknfy 
FLEMNTRLQVEHPVTECITGLDLVQEMIRVAKGYPLRHKQADIR 
INGWAVECRVYAEDPYKSFGLPSIGRLSQYQEPLHLPGVRVDSG 
IQPGSDIS I YYDPMISKLITYGSDRTEAIiKRMADALDNYVI RGV 
THNIALLREVT insrfvkgdi STKFLS DVYPDGFKGHMLTKS ek 

nqllaiasslfvafqlraqhfqensrmpvikpdianwexsvklh 
dkvhtwasnngsvfs vevdgsklnvtstwnlas pllsvsvdgt 
qrtvqcls reaggnms iqflgtvykvn i ltrlaaelnkfmle kv 
tedtssvlrs pmpg wvavs vkpgdavaegqei cvi eamkmqns 

MTAGKTGTVKSVHCQAGDTVGEGDLLVELE 


6233 


1 


2654 


HSTRENLNAGNFNF PS EGHLVRSTGPGGS FAKHMVAQCVSP KG P 
LACSRT YFFGATHVP YIXSGDS KLPKKTEQ IRTJ»SQ I YAAV3 EAV 
LAG IACYAKTSS LTKAKEVAEQTLGSGLDS FELI P FKAALRS KM 
TFH IHAVNNQGR I VP LDS ED S L»S FVKTACMAVYD I PDLI/3GNGC 
LGSWFSES FIiTSQ I LVKEKDGTVTTETSS WLTAAVPRFCS WL, 
VEDNEVKLS EKTHQAVRGDES FLGTYLVTGG EGAYI> YS SNLQS WP 
EEGNVHFFS SGLVLFSHCRHGS 1 1 1 SKDHMNS I S FYDGDSTS TVA 
AlJ^IDFKSSL^HIiPVHFHGSSNFIiMIALFPKSKIYQAFYSEVF 

SLWKQQDNSGI SLKV I QEDCiltS V&QlUUanboAyiUjr ^ivl^yr'/ivj 
EKRSSLiCbIiSAKIiPEIiDWFI*QHFAI SS I SQEPVMRTHLPVLLQQ 
AEINTTHRIESDKVIISIVTGLPGCHASELCAFLVTLHKECGRW 

MVYRQlMDSSECFHAAHFQRYLSSALEAQQNRSARQSAyiRKKT 
RLLVVLQGYTDVI DVVQALQTH PDSNVKAS FTIGAITACVEPMS 
CYMEHRFIjFPKCIJJQCSCCLVSNVVFTSHTTEQRHPIJjVQIjQSL. 
I RAANPAAAFI LAENG IVTRNED I EI*I LS ENS FSS PEMLiRS R Yl» 
MYPGWYEGKuNAGS VYPLMVQ I CVWFGRPLEKTRFVAKCKAI OS 
S I KPS P FS GNI YHI LGKVKFS DS ERTMEVCYNTLANS LS I MP VIi 
EGPTPPPDSKSVSQDSSGQQECYI*VFIGCSI*KEDS IKDWLRQSA 
KQKPQRKALKTRGML.TQQEI RS I HVKRHL»E PLPAG YFYNGTQFV 
NFFGDKTDE^PLMDQFMNDYVEEANREI EKxNQEIiEQy fc- x WUJjr 
ELKP 


6234 


1731 


404 


PRVRJEXMDHKSPGNKGSDVYAGIKSIVKSSLGMVESSRHNWSGL. 

DKQSDIQNLNEERI IiALQLCGWI KKGTDVDVGPF1»NSI»VQEGEW 

ERAAAVALFKLDI RRAI QIIiNEGASSE KGDlj31jNWAMAX»SGYT 

DE KNSLWRET^CSTLiRIjQLNNPYLCVMr AF IiTS ETtro X UL» v u x jsm 

KVAVRDRVAFACKFLS DTQLNRY I EKLTN EMKEAGNL EG I LiLTG 

LTKDGVDIiMES YVDRTGD VQTAS YCMT >Q G S PLDVLKDERVQYW I 

ENYRNLLDA^FVraKRAEFDIKllSKLDPSSKPIAQVFVSCNFCG 

KSISYSCSAVPHQGRGFSQYGVSGSPTKSKVTSCPGCRXPLPRC 

ALCLINMGTPVSSCPGGTKSDEKVDLSKDKKIAQFl^NWFTWCHN 

✓"tT>TT/-"/*-»t_iTs /~»T_r?u?T eTJt?D"T4tiTk PPOVQ aPTP VfMnT inTTGNLVPAETV 
CKnwwxiAwxlNuoWlr KJJxlMiiW.ir v o/*^.x v»i\.v— ri^i-tux iw»u« * * » 

QP 


6235 


1 


571 


EKRDHRLPSWPRAAIiKVPGRGGRVGTTPELAAGGIMATRNPPPQ 
DYESDDDSYEVLDLTE YARRHQWWNRVFGHS SGPMVE KYSVATQ 
TirMr:mrrrswf , a.r , 'FT .T7nTT\7T:KT.7VATAVGGGFLLL»OIASHSGYVC I 
DV7KRVEKDVNKAKRQ I KKRANKAAPE INNLI EEATEF I KQNI V I 
SSGFVGGFL.EGLAS 


6236 


1 


703 


WDQNKGAAAGSGLTliPSLPSARFSAGPPTQRSRPTMSNMEKHL.F 

NIiKFAAKELSRS AKKCDKE EKAE KAICC KKAI QKGNMEVAR ILLAE 
NAI RQKNQAWFLRI^ARVDAVAARVQTAVTMGKVTKS WAGVVK 
SMDATLKTMNLEKI S AI^D KFEHQFETLDVQTQQMEDTMS STTT 
LTTPQNOVDMLLQEMADEAGXDLNMELPO^QTGSVGTSVASAEQ 

DELSQRLARLRDQV 


6237 


312 


720 


PTAMAEEGIAAGGVMDVNTALfQBVLKTAIjIHDGLARG IREAAKA 

LDKROAHIiCVLASNCDE PM WKLVEALCABHQINL»I KVDDNKKL 
GEWVGLCKIDREGKPRKWGCSCVVVKDYGKESOAKDVIEEYFK 

CKK 


6238 


? 


4666 


EBVPTQESVKWEINVI IKNPEIVFVADMTKNDAPAL.VITTQCEI 
CYKGNL»ENSTMTAAI KDLQVRACP FLPVKRKG KI TTVLQPCDIjF 
YQTTQ KGTD PQV I DM S VKS LTL KVS ? VI INTMITITSALYTTKE 
TIPEETASSTAHLWEK20>TKTLKMW?T^E^ 

KGEMIKMNIDSIFIVLEAGIGHRTVPMIilAKSRFSGEGKNX^SSL 
INUICQIiEIiEVHYYNEMFGVWEPLIiEPLEIDQTEDFRPWNLGI K 
MKKKA1CMAIVESDPERENYKVPEYKTVISFHSKDQLNITLSKCG 
LVMIiNNL»VKAFTEAATGSS ADFVKDLAPFK I LNSLGLT I SVS PS 
DS FSVLNI PMAJCS YVLKNGE SI>SMDYI RTKDNDHFNAMTS1*S S K 
LFFI1.L.TPVNHSTADKI PLTKVGRRLYTVRHRESGVERS I VCQI 
DTVEGS KKVT I RS P VQ I RNHFS V PLS VYEGDTLLGTAS PENE FN 
I PLGSYRS F I FLKF EDENYQMCEG I DFS E 1 1 KNDGALLKKKCRS 
KNPS KES FI*IN I VPEKDNLTSbSVYSEDGVfDLPYIMHLWP PILL 
RNLLPYKI AYY IEG I ENS VFTLSEGHSAQ I CTAQLGKARLHLKL 
LDYKSHDWKSEYH IKPNQQDI S FVS FTCVTEMEKTDLDIAVHMT 
YNTGO/IVVAFHSPyVfMVNKTGRMLQYKAJDGIHRKOT 
FS FQPNHFFNNNKVQLMVTDSELSNQFS IDTVGSHGAVKCKGLK 
MDYQ VGVT I DLS S FNI TRI VTFTP FYM I KNKS KYH I SVAEEGND 
KWLSLDLEQCIPFWPEYASSKLLIQVERSEDPPKRIYKNKQENC 
I LLRLDNELGG I IAEVNLAEHSTVITFLDYHDGAATFLLINHTK 
NELVQ YNQS S LS E I ED S LP PG KAVFYTWAD P VGS RRL KW R CRKS 
HGE VTOKDDMMMPIDLGEKTI YLVS FFEGLQRIILFTEDPRVFK 
VTYESEKAELAEQEI AVALQDVG I SLVNNYTKQEVAYIG I TSSD 
VVWETKPKKKARWKPMSVKOTEKLEREFKEYTESSPSEDKVIQL 
DTNVPVRLTPTGHNMKILQPHVIALRRNYLPALKVEYMTSAHQS 
S FRI QI YRI Q I QNQIHGAVFPFVFY PVKP PKS VTMDSAP KPFTD 
VS I VMRSAGHSQISR I KYFKVL I QEMDLRLDLGFI YALTDLMTE 
AEVTENTEVELFHKDIEAFKEEYKTASLVDQSQVSLYEYFHISP 
I KLHLS VSLSSGREEAKDS KQNGGL I PVHS LNLLLKS IGATLTD 
VQDWFKI^FELNYQFHTTSDI^SEVIRKYS KQAIKQM YVLIL 
GLDVLGNPFGLI REFSEGVEAF FYE P YQGAI QGPEEFVEGMALG 
L KAL VGGAVGG LAG AAS KI TGAMAKGVAAMTMDEDYQQKRREAM 
NKQPAGFREG I TRGGRGLVSGFVSG I TG XVTKP I KGAQKGGAAG 
FFKGVG KGL VGAVAR PTGG I XDMAS STFQG I KRATETS EVESLR 
PPRFFNEDGVIRPYRLRDGTGNQMLQKIQFYREWIMTHSSSSDD 
DDDDDDDDESDLNH 


6239 


2108 


634 


K PGMAGKG S SGRR PLLLGLL VAVATVHL.V I C P YTKVEES FNLQA j 
THDIXYHWQDLEQYDHLEFPGVVPRTFIX3PWIAVFSSPAVYVL } 

SLLEMS KFYSQL I VRGVLGLGVI FGLWTLQKEVRRHFGAMVATM 
FCWVTAMQFHIJ^FYCTRTLPNVIiALPVVLLAIiAAWLRKEW 
WLSAFAI I VFRVE LCLFLGLLLLLALGNRKVSVVRALRHAVPAG 
ILCLGLTVAVDS YFWRQLTWPEGKVLWYNTVLNKSSNWGTS PLL 
WYFYSALPRGLGCSLLFI PLGLVDRRTHAPTVLALGFKALYSLL 
PHKELRFI I YAFPMLNI TAARGCS YLLNNYKKSWLYKAGSLLVI 
GHLVVNAAYSATALYVSHFNYPGGVAMQRLHQLVPPQTDVLLHI 
DVAAAQTGVS R FLQVNSAWR YDKREDVQPGTGMLAYTH I LMEAA 
PGLLALYRDTHRVIJ^WGTTGVSLNLTQLPPFirVHIjCTKLVLL 
ERLPRPS 


6240 


2202 


1176 


HERGDSLKEPTS IAESSRHPSYRSEPSLEPESFRSPTFGKSFHF 
DPLSSGSRSSSLKSAQGTGF3LGQLQSIRSEGTTSTSYKSLANQ 
TRNGSLS YDS LLTPSDS PDF3S VQAG PE PDP PLC YTSPFLSARL 
AQQREAERHPRLVPTGPTHREPSPVRYDNLSRHIVASLQEREKL 
LRQSPPLPGREEEPGLGDSGIQSTPGSGHAPRTSSSSDDSKRSP 
LGKTPLGRPAVPRFGKPIX5LKv»RGVtf £» ffc.i'Vafc* 1 ArXUuunoZo 
SQKAQPGVSETEEVALQPLLTPKDEVQLKTTYSKSNGQPKSLGS 
ASPGPGQPPLSSPTRGGVKKVSGVGGTTYEISV 


6241 


3 


1341 


RNAEEKKRLSLQREKI IARVS I DNRTRALVQALRRTruP KLC IT 
RVEELTFHLLEFPEGKGVAVKERIIPYLLRLRQIKDETLQAAVR 
EILAL I G YVDP VKGRG IRILS IDGGGTRGWALQTLRKLVELTQ 
IGTVHQLFTDYICGVSrGAILAFMLGLFlIMPI^ECEELYRjajGSDV 
FSQNVI VGTVI04SWSHAFYDSQTWENILKDRMGSALMIETARNP 
TCPKVAAVSTIVNRGITPKAFVFRNYGHFPG INSHYLGGCQYKM 
WQAIRASSAAPGYFAEYAI^NDLHQIX^LLLNNPSALAMHECKC 
LWPDVPLECXVS LGTGRYBS DVRNTVTYTSLKTKLSNVINS ATD 
TEEVHI MLDGLLP PDTYFRFNPVMCEN I PLDESRNEKLDQLQLE 
GLKYI ERNEQ KM KKVAKI LSQEKTTLQKINDWI KLKTDMYEGLP 
FFSKIt 


6242 


198 


1310 


QHFLPGAETWS PGAAV CTARR F PGRS LAAFP RPAAP RRAV EMG E 
SSEDIDQMFSTl^EMDlrLTQSLGVDTLPPPDPNPPRAEFNYSV 
G FKJDIxNE S LNALET^QD LDAIjMJUJLVADI S EAEQRTI Q AC; KES LQ 

VIJ5LPLPPPPPEPI»SQEEEEAQAKADKJKIiALEKLKBAXVKKl»V 
VKNmMNDNSTKSI^DERQIJUU^VLD^PEKTHCIX^A^ 

KNPQNFYLDNRGKXESKETXreKMNAX*^ 
KDVCS I FKS FASEKKGKI 


6243 


1509 


614 


RSASRFSGCWSRDSTCCCCPSTCWSRSSASCPRARWPPSSAPAT 
TS RAS S RRLACG PQTRAGAETRS TAM I RANS AARDTRRAT CRSA 
AGTPS PTTMTC1/TDVPTG CAAVE PTARLPAAAWAST I TTG CCPA 
MGQ AGAG PAGRKG S EAGGG PGRAHHAHP S P L P R E P RVRTG ? PAH 
SPTPGSIDPS PEDSWGSAGVTQES PLLDPVDFLLFRTRAVDPLR 
RVFFFFYQHLTFFSIQPQPPPCHAFHPRDPPAGTKRQLII/VPIiK 

n YT * BTT OT 'POTT CBUCPVOBBCBThfVHnT C 


6244 


2119 


1745 


FEHAYASQFGTFLGNNESERCKLKLQQKTMSIiWSWVNQPSELSK 
FTNPL FEANWLVI WPS VAPQSLP LWEGI FL-RWNRS SKYLDEAYE 
EMVNI IEYNKELQAKVNILRRQIAELETEUGMQES P 


6245 


81 


1148 


LSLRNAKYSFPQEblSIiFSMTDIiNDNICKRYIKMITNIVILSLil 
I CI SLAFWI ISMTASTY YGNIiRPIS PWRWLFSVWPVLI VSNGL 
KKKSLDHSGAIXBGIiWGFI I/TIANFS FFTSLLMFFLSS SKLTKW 
KGEVKKRliDS E YKEGGQRNW V Q V FCBGAVPTfcijAijjji Kl 
EIPVDFS KQYSASWMC^SLLAAliACSAGDTWASEVGPVLSKSSP i 
RL I TTWE KVP VGTNGGVTWGIjVS S LIiGGTFVG I AYFLTQL I FV 
NDLDI SAPQWP 1 1 AFGGIiAGLLGSI VDS YU3ATMQYTGLDESTG 
MWNS PTNKARHIAGKP I LDNNAVN L. FS S VL IALIiL PTAAWG FW 
PRG 


6246 


1177 


359 


SL^PVUIiMDDSI^QISLQLLCVYTANFPNGCSSLCWSSCGQHPV 
QAT HRG AVSNS LMLCI Ij KLAS QMPL ENTTVQQMVFMLLS NFLALiS 
HDCKGVlQKSNFIjQNFLSLAIiPKGGNKHIiSNIiTI LWDKLLLNI S 
SGETCOXJMILRIiDGOjDLLTEMSKYKHKSSPLLPIilFHNVCFS 
PANKPKI LANE KV I TVLAACLESENQNAQRIGAAALWAL I YTTYQ 
KAKTALKS PS VKRRVDEAYS LAKKTFPNS EANPIjNA YYXi KCLEN 
ItVQLLNSS 


6247 


3 


i c i a 


"uc xywrjfz DWTPD c jvfiQT .P PM A T? VATJRNS KKT /?IiVPL«TDDTSHAG D 
PGPGRALLECDHliRS GVPGGRRRKDWS CS LLVASLAGAFGS S Fir 
YG YNI*S VVNAPTP YI KAFYNES WERRHGRP I D PDTLTLLWS VTV 
SIFAIGGLVGTLIVKMIGKV1jGRKHT1>1iA^GFAISAAIjL^ 
MJAGAFEMLIVGRFIMGIDGGVALSVLPMYLSEISPKEIRGSIiG 
QVTAI FI CIGVFTGQLLGLPEI.LGKESTWPYLFGVI WPAWQL 
I^LPFLPDSPRYlxLLEKHNEARAVKAFQTFljGKAHVSQEVEEV^ 
AESRVQRS IRLVSVLEIiLRAPYVRWQVVTVI VTMACYQLCGLNA 

i wfytns i fgkag i p paki pyvtlstgg ietlaavfsglvl ehli 
grrplliggfglmglffgtltitltlodhapwvpylsivgiiai 

IASFCSGPGGIPFI LTGEFFQQSQRPAAFI I AGTVNWLSNFAVG 
IxLFPFIQKSlxDTYCFXVFATICITGAIYI*YFVl^ETK2JRTYAEI 
SQAFS KRNKAYPPEEKI DSAVTDGKINGRP 


6248 


5£ 


1773 


VPPPRMMAAVPPGLEPWNRVRI PKAGNRSAVTVQNPGAALDLCI 
AAVI KECHIjVI LS LKS QTLDAETD VIjCAVLYSNHNRMGRHKPHL 
ALKQVEQCLKRLKNMNIiEGS I QDLFELFSShlENQPLTTKVCVVP 
SQPVVEI,VU4KVLGACKLI*1jRI^ 1 
LNIiVMVGLVS RLWVLY KGVLKRL I LL YE PLFGLLQEVARIQPMP 
YFKDFTFPSDITEFLGQPYFEJ^KiCKMPIAFAAKGINKLliNKI,F 
LINEQS PRAS EETLLG I SKKAKQMKINVQNNVDLGQ PVKNKRVF 
KEESSEFDVRAFCNQIiKHKATQETSFDFKCSQSPJbKTTKYSSQK 
V I GTPHAKS FVQR FREAES FTQLS EE IQMAWWCRS KKLKAQAI 
FLGNKIiLKSNRIjKHLEAQGTSLPKKLECI KTS ICNHLLRGSGIK 
TSKHHLRQRRSQNKFLRRQRKPQRKLQSTLLREIQQFSQGTRKS 
ATDTSAKWRLSHCTVHRTDLY PNS KQLLNSGVSMPVI QTKEKMI 
HENLRG IHENETDS WTVMQ INKNSTSGTI KETDDIDDI FALiMGV 


6249 




1773 


VPPPRMMAAVPPGLEPWNRVRI P KAGNRS AVT VQN PGAALDLC I 
AAVIKECKLVILSLKSOTIJJAETDVIjCAVLYSNHNR^RHKPHL 
ALKQVEQCLKm.KNMNLEGSIQDLFELFSSNENQPLTTKVCVVP 
SQ P WE LVLMKVLGACKLLLRLLD CCCKTFLLTVKHLGLQEFI I 
I^VMVGLVSRLWVLYKGVLKRLILLYBPLFGLLQEVARIQPMP 
YFKDFTFPSDI TEFLGQPYFEAFKKKMP IAFAAKGINKTJ «NKLF 
L INEQS PRAS EETLLGI S KKAKQMKI NVQNNVDLGQ PVKNKRVF 
KEESSEFBVRAFCNQLtKHKATQETSFDFKCSQSRLKTTKYSSQK 
VI GTPHAKS FVQRFREAESFTQLS EE IQMAWWCRS KKLKAQAI 
FI^NKLLKSNRLKHLEAOGTS LP KKLECI KTS I CNHLLRGSG I K 
TS KHHIiRQRRSQNKFLRRQRKPQRKLQSTLLRE IQQFSQGTRKS 
ATDTSAKWRLSHCTVHRTDLYPNS KQLLNSGVSMPVI QTKEKMI 
HENLRG IHENETDS WTVMQINKNSTSGTIKETDDIDDI FALMGV 


6250 


232 


1306 


IAAIiHIMAiPFRKDLEKYKDLDEDELIXS^SETEI^QLETVLDD 
IJDPENALLPAGFRQKNQTSKSTTGPFDREIHLLSYLEKEALEHKD 
RED YVPYTGEKKGKI F I PKQKP VQTFTEEKVS LDPELEEALTSA 
^DTFLfDtJX A.T'L.GMHNLITNTKFCNI MGSSNGVDQEHFSNVVKG 
EKI LPVFDE P PN PTNVE ESLKRT KENDAHLVE VNLNN I KN I P I P 
TLKD FAKALETNTHVKC FS LAATRSNDPVATAFAEMLKVNKTLK 
SLNVESNFITGVGILAL IDALRDNETLAELKIDNQRQQLGTAVE 
LEMAKMIaEEin^ILKFGYQFTOXKyPRTRAANAITKNNDLVRKRR 
VEGDHQ 


6251 


62 


972 


TPGSGPMSAWAAASLSRAAARCLLARGPGVRAAPPRDPRPSHPE 
PRGCGAAPGRTLHFTAAVPAGHNKWS KVRHI KGPKDVERSRI FS 
KLCIjNIRLAVKEGGPNPEHNSNLANILEVCRSKHMPKSTIETAL 
KMEKS KDT YLLYEGRGPGGS SLL I EALSNS SHKCQAD I RH 1 LNK 
NGGVMAVGARHS FDKKGVI WEVEDREKKAVNLERALEMAI EAG 

aedvketedeeernvfkficdasslhqvrkkldslglcsvscal 
efi pns kvqlaepdleqaahliqalsnhedvihvydnie 


6252 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQTKRKKPRRYWEE 
ETVPTTAGASPGPPRNKKNRELRPQRPKMAYILKKSRISKKPQV 
PKKPREWKNPE S QRGLSGAQD PF PGPAPVPVE WQKFCRIDKSR 
KLPHSKAKTRSRLEVAEAEEEETSIKAARSELLLAEEPGFLEGE 
DGEDTAKICQADI VEAVDIASAAKHFDLNLRQFG PYRLNYSRTG 
RHLAFG GRRGHVAALDWVTKKLMCE I NVM EAVRD I RFLH SEALL 
AVAQNRWLHIYDNQGIELHCIRRCDRVTRLEFLPFHFLLATASE 
TGFLT YLDVS VGKI VAALNARAGRLDVMSQNP YNAVI HLGHSNG 
TVSLWSPAMKEPLAKILCHRGGVRAVAVDSTGTYMATSGLDHQL 
KI FDLRGT YQPLS TRTLPHGAGHLAFSQRGLLVAGMGDVVNI WA 
GQGKASPPSLEQPYLTHRLSGPVHGLQFCPFEDVLGVGHTGGIT 
SMLVPGAGEPNFDGLESNPYRSRKQRQEWEVKALLEKVPAEL I C 
LDPRALAEVDVI SLEQGKKEQIERLGYDPQAKAPFQPKPKQKGR 
SSTASLVKRKROCVMDEEHRDKVRQSI^Q^HHKEA^ 
ALDRFVR 


6253 


27 


1897 


EEFCTW I AVRVGEMETAP KPGKD VP P KKDKLQTKRKKPRR YWEE 
ETVPTTAGASPGPPRNKKNRELRPQRPKHAYILKKSRISKKPQV 
PKKPP^WKOTESQRGI^GAQDPFPGPAPVPVEWQKFCRIDKSR 
KLPHS KAKTRSRLEVAEAEEEETS I KAARS ELLLAE E PGFLEGE 
DGEDTAKI CQAD I VEAVD IASAAKHFDLNLRQFG PYRLNYSRTG 
RHLAFGGRRGHVAALDWVTKXLMCEINVMFAVRDIRFLHSEALL 
AVAQNRWLH I YDNQG I ELtHCI RRCDRVTRLE FLP FHFIjLATASE 
TGFLTYLDVSVGKIVAAI2JARAGRLDVMSQNPYNAVIHLGHSNG 
T VS*L»WS PAMKEPIAKI LCHRGGVRAVAVDSTGTYMATSGLiDHQI* 
KI FDLRGTYQPLSTRTLPHGAGHLAFSQRGLLVAGMGDWNIWA 
GOGKASPPSLEtSPYLTHRI^GPVHGIiQFCPFEDVLGVGHTGGIT 
SMLVPGAGEPNFDGLESNPYRSRKQRQEWEVKALIjEKVPAETjIC 
LDPRAIAEVDVISLEC^KKEQIERIiGYDPQAKAPFQPKPKQKGR 
S STASLVK3UCRKVIWEEHRDKVRQSLQQQHHKEAKAKPTGARPS 
ALDRFVR 


6254 


155 


1139 


HAIX5RRGGSQEUSAAACGCFALRLRAPGSGRPAIAPGAAAFAGL 
GGAPRFPPRGSAAGRTMLLKEYRICMPLTVDEYXIGQLYMISKH 
SHEQSDRGEGVEWQWEPFEDPHHGNGQFTEKRVYLNS KLiPSWA 
RAWPKIFYVTEKAHNYYPYTITEYTCSFLPKFSIHIETKYEDN 
KGSNDTI FDNEAKDVEREVCFIDI ACDEI PERYYKESEDPKHFK 
S EKTGRGQLREG WRDS HQP IMCS Y10_»VTVKFEVWGLQTRVEQFV 
HKVVRDILLIGHRQAFAWVDEWYDPnT^DVR^YEKlMHEQTNI^ 
VCNQHSSPVDDI ESHAQTST 


6255 


1 


1444 


PTRPQQELLVSIATVI FVASQKALS VESKAVI KQQUESVSNGWT 
VYR I ARQASRMGNHDMAKSLYQSLLTQVAS KHFYFWLNS L.KEFS 
HAE Q CLTGLQ EENYS S ALS CIAES LKFYHKG IAS LTAASTPLNP 
1£ FQCEFVKIiRI DLIjQAFSQLI CTCNSLKTSP P PAXATT I AMTIj 
GNDLQRCGR I SNQMKQSMEEFRSLASRYGDLYQAS FDADS ATLR 
NVEliQQQS CLLI SHAI E ALIUDPESAS PQEYGS TGTAHADS EYE 
RRMMSVYNHVLEEVESLNGKYTPVSYMHTACLCNAIIALIjKVPL 
SFQRYFFQBCLQSTS IKLALSPSPRNPAEP IAVQNNQQIiALKVEG 
WQHGSKPGLFRKIQSVCLNVSSTLQSKSGQDYKIPIDNMTNEM 
EQRVEPHNDYFSTQFLLNFAI LGTHN I TVESSVKDANG I VWKTG 
PRlTIFVKSLEDPYSC^IRI^CX}AQQPbGXX2QQRNAYTRF 


6256 


1 


1542 


CRGAGAEPAANPRSPRSLVPSIiESTSTSVPPAPGTMATDSWAIA 
VDEQEAAAES LSNIiHLKEEK I KPDTNGAWKTNANAEKTDEEE K 
EDRAAQSLLNKLIRSNLVDNTNQVEV1X3RDPNSPLYSVKSFEEI* 
RLKPQIJ^VYAMGFNRPSKIQENALPLMLAEPPQNLIAQSQSG 
TGKTAAFVLAMLSQVEPANXYPQCLCXiSPT^LAL EQM 
GKF Y P EL KLAYA VRGNKLE RGQK I SEQIVIGTPGTVLiDWCSKIjK 
FIDPKKIKVFVLDEADVMIATQGHQDQSIRIQRMLPRNCQMLLF 
SATFEDS VWKFAQKWPDPNVT KLKRE EETLDT I KQ YYVTjCS S R 
D EKFQAliCNL YG AI T IAQ AM I FCHTR KT AS WLiAAELS KEGHQ VA 
LI^GEMMVEQRAAVIERFREGKEKVLVTTNVCARGIDVBQVSVV 
INFDLPVDKDGNPDNETYLJIRIGRTGRFGKRGLAVNMVDSKHSM 
NILNRIQEHFNKKIERLDTDDLDEIEKIAN 


6257 


210 


615 


AFIPAMAELIQKKLQGEVEKYQQLQKDLSKSMSGRQKLEAQIiTE 
NNITVKEEIALLJX5SNVVFKLLGPV^ 

TAEIKRYESQLRDLERQSEQQRETLAQLiQQEFQRAQAAKAGAPG 
KA 


6258 


210 


615 


AFI PAMAE L I QKKLQGEVEKYQQLQKDI»S KS MSGRQKLEAQI/TE 
I^r\^EIJU J LIX3SNWFKIiI/3PVLVKQEU?EAPATVGiaiI,DYI 
TAEI KRYESQIiRDLERQSEQQRETIAQLQQEFQRAQAAKAGAPG 
KA 


6259 


2 


1540 


IL£KGFPSQCHPERKWKVDDVX.ESSQENEDDHFWEI.LFHNNKTV 
SVENGDRGSKTFNLGTDPVSLRNYPYKI CDS CEMNLKNT SGL 1 1 
SKKNCSRKKPDEFWCEKLLLDIRHEKIPIGEKSYKYDQKRNAI 
NYHQDLSQPS FGQS FE YS KNGQG FHDEAAFFTNKRSQ I GETVCK 
YNECGRTFIESLKLNI SQRPHLEMEPYGCS ICGKSFCMNLrRFGH 
QRALTKDNP YEYNE YGEI FCDNS AFI IHQGAYTRKILREYKVSD 
KTWEKSALLKHQIVHMGGKSYDYNENGSNFSKKSHliTQLRRAHT 
GEKTFECGECGKTFWEKSNIiTQHQRTHTGEKPYECTECGKAFCQ 
KPHLiTNHQRTHTGE KPYECKQCGKTFCVKSNlrTEHQRTHTGEKP 
YECNAOGKS FCHRSALTVHQRTHTGEKPFI CNECGKS FCVXSNL 
IVHQRTHTGEKPYKCNECGKTFCEKSALTKHQRTHTGEKPYECN 
ACGKTFSQRS VLTKHQR I HTRVKALSTS 


6260 


2081 


1436 


GTGPEIHAOU^ASARAPGSRAMALREIjKVCIiI^DTGVGKSSIVW 
RF7EDSFDPNINPTIGASFMTKTVQYQNELHKFLIWDTAGQERF 
RALAPWYYRGSAAAI I VYDITKEETFSTLKNWVKELRQHGPPNl 
WAI AGNKCDL I DVRE VMERDAKD YADS IHAI FVETS A KNAIN I 
NELFIEISRRIPSTDANLPSGGKGFKLRRQPSEPKRSCC 


6261 


3 


1188 


FWYRIiGPGTRSRWPRRGSWAASLVPRGPSPAAIiVTSPCPPDPLR 
SPACEPCRPDFAPRPAIjIjIjRSGPRSAPAVTGKPALKGQPGPWPG 

maevs idqs klpgvkevcrdfavle dhtlars lqeqe i eh h las 
nvqrnrlvqhdu3vakqlqeedlkaqaqlqkrykdleqqdceia ! 

qei qekla i eaerrriqekkdedi arllqekelqeekkrkkhfp 
efpatrayadsyyyedggmkprvmkeavstpsrmahrdqewyda 
eiar klqeee li*atqvdmraaqvaqde e iarllmaeekxaykka 
kerekssldkrkqdpewkpktakaanskskesdephhsknerpa 
rppppimtdgedadythftnqqsstrhfsksesshkgfhykh 


6262 


2 


1759 


PECHSCHSLCSVHRPGKVPQARI^GLVLGQRDEPAGHRLSQEEIIi 
a. ci tp t .v<?oc;t .f at .r RHOA VLOST »SOTI EC1iOGGGHEEG1»VHEK 
ARQIiRRSMENI ELG LS EAQVMIJUxA^ HLSTVES E KQKIiRAQ VRR 
I^ENQWLRDEI^TG^RI^RSEQAVAQLEEEKKHI^FLGQLRQ 
YDEDGHTSEEKEGDATKDSLDDLFPNEEEEDPSNGLSRGQGATA 
AQQGG YE I PARLRTLHNI> V I Q YAAQGR YEVAVP LCKQALEDLiER 
TSGRGHPDVATMI^ILAl*VYRDQNKY'KEAAHI*liNDALS IRESTIi 
GPDH PAVAATLNNI^VL YGKRGKYKEAEPLCQRAIjEIRE kvlgt 
NHPDVAKQLNNliAIJJCQNCXSKYEAVERYYOJ^AIAI YEGQLGPDN 
PNYARTKNNLASCYLKQGKYAEAETLYKEILTRAHVQEFGSVDD 

dhkpiwmhaeereemsksrhheggtpyaeyggwykackvssptv 

fTTTLRNLGALYRROGKLFJlAE^ 

EIiLGESDG^TSQEGPGDSVKFEGGEDASVAVEWSGDGSGTLQR 
SGSLGKIRDVLRR 


6263 


1 


2408 


RE LDS LADLPER I KP P YANGLSTSHIjRS S S VED VKLiI I S EGRPT 
IEVRRCSMPSVICEHTKQFQTISBESNQGSLI*TVPGDTSPSPKP 
EVFSNVPERDLSNVSNIHSSFATSPTGASNSKYVSADRNL I KNT 
AP VNTVMDS PVHI*E PS SQ VGVI QNKSWEMP VDRLETLSTRD FI C 
PWSNIPIX3ESSDQSFCNSENKVLKENADFLSLRQTELPGNSCAQ 
DPASFMPPQQPCSFPSQSLSDAES1SKHMSLSYVANQEPGILQQ 
KNAVQI ISSAJ^TDNESTKDTENTFVTjGDVQKTDAFVP VYSDST 
IQEAS PNFEKAYTLPVLPSEKDFNGSDASTQLNTHYAFSKLTYK 
SSSGHEVENSTTDTQVI SHEKENKLESLVLTHLSRCDSDLCEMN 
AGMPKGNLNEQDPKHCPESEKCLJjSIEDEESQQS ilsslbnhsq 
QSTQPEMHKYGQLVKVEIjEENAEDDKTENQI pqrmtrnkantma 
NQSKQIIiASCTLI^SEKDSESSSPRGRIRLTEDDDPQIHHPRKRK 
VSRVPQPVQVS PS LLQAKEKTQQSIAAI VDSLKLDEIQP YSSER 
ANP YFE Y1»HI RKKIEEKRKLLCS V I PQAPQYYDEYVTFNGSYUL 
IX^I^KICIPTITPPPSLSDPLKBLFRCXiEVVRMKIJlIjQHSIE 
REKLIVSNEQEVLRVHYRAARTLANQTLP 

PLDSQ S DDS KTS VRIJRFNARQFMSWLQDVDDKFDKIiKTCIiIjMRQ 
QHEAAAIJJAVQRLEWQLKLQEIjDPAT Y KS IS I YEI QEF YVPLVD 
VNDDFEI/TPI 


6264 


143 


1960 


khrqeionaldmapeihotgpmclientngelvanpealkilsai 
tqpvvwarvglyrtgksyiimnklagknkgfslgstvkshtkgi 

WMWCVPHPKKPEliTLVLLDTEGLGDVKKGDNQNDS WI FTLAVLI* 
S STL VYNS MGT INQQAMDQ L Y YVTEI/THR I RS KS S PDENENEDS 
ADFVS FFPDFVWTliRDFS LDLEADGQPIjTPDEYLEYSI*KLTQGT 
SQKDKNFNLPRLCIRKFFPKKKCFVPDLPIHRRKLAQLEICLQDE 
ELDPEFVQQVADFCS Y I FSNS KTKTLSGG I KVNGPRLESLVLTY 
I NAI S RGD L PCMENAVIiALAQ I EN S AAVQ KAI AHYDQQMG QKVQ 1 
LPAETLQEI*LDLHRVS EREATEVYMKNS FKDVDHLFQKKLAAQL 
DKKRDDFCKQNQEAS SDRCSALLQV I FS PLEEEVKAG I YSKPGG 
YCLFIQKLQDLEKKYYEEPRKGIQAEEILQTYLKSKESVTDAIL 

QTDQ I LTE KEKE I EVECVKAES AQAS AKMVEEMQ I KYQQMMEE K 
EKSYQEHVXQLTEKMERERAQLLEEQEKTLTSKXQEQARVLKER 
CQGESTQLQNE IQKLQ KTLKXKTKR YMS HKLKI 


6265 


143 


1960 


KHRQENNALDMAPE I HMTG PMCLI ENTNGE LVANPEALKI LS AI 
TQPVWVAI VGLYRTGKS YLMNKLAGKNKG FSLGSTVKSHTKGI 
VTOWCVPHPKKPEHTLVIjLDTEGI^DVKKGDNQNDSW I FTLAVLL 
SSTLVYNSMGTII^QAMIXSIiYYVTELTHRIRSKSSPDENENEDS 
ADFVSFFPDFVWTLRDFSIiDLEADGQPLTPDEYIiEYSLKLTQGT 
SQKDKNFNLPRLCI RKFFPKKKCFVFDLP IKRRKLAQLEKLQDE 
ELDPEFVQQVADFCS Y I FSNSKTKTLSGG I KVNGPRLES LVLTY 
INAISRGDLPCMENAVLALAQIENSAAVQKAIAHYDQQNaGQKVQ 

LPAETLQELLDLHRVSEREATEVYMKN S FKDVDHLFQKKLAAQL 
DKKRDDFCKQNQEAS SDRCS ALLQVI FS PLEEEVKAGI YSKPGG 
Y CLFI QKLQDLEKKYYEEPRKG IQAEE I LQTYLKSKE S VTDAIL 
QTDQILTEKEKEIEVECVKAESAQASAKMVEEWQIKYQQMMEEK 
EKS YQEHVKQLTEKMERERAQLLEEQEKTLTS KLQEQARVLKE R 
CQGESTQLQNEIQKLQKTLKKKTKRYMSHKLKI 


6266 


276 


1421 


GSkOKQMLVPCFLYSLQNRKPSLYGSLTCQGIGLDGIPEVTASE 
GFTVNEINKKS IHISCPKENASS KFLAPYTTFSRIHTKS ITCLD 
I S S RGGLG VSS S TDGTMKI WQASNGELRRVLEGHVFDVN CCRFF 
PSGLWLSGGMDAQLKI WSAEDASCWTFKGHKGGI LDTAI VDR 
GRNVVS ASRDGTARLV7DCGRS ACLG VLADCG S S INGVAVGAADN 
SINIjGSPEQMPSEREVGTEAKMI^LlAREDKKLQCLGLQSRQL.VF 
LFIGSDAFNCCTFLSGFLLIiAGTQDGN I YQLDVRSPRAPVQVI H 
RSGAPVDSLLSVRDGFIASQGDGSCFrVQQDLDYVTELTGADCD 

P VYKVATWE KQ I YTCCRDGLVRR YQLSDL 


6267 


3 


622 


"lgmmkknnsakrgpqdgnqqpappekvgwvrkfcgkg i fre i wk 
nryvvlkgdqlyisekevkdekniqevfdlsdyekceelrksks 
rskknhskftlahskqpgntapnli flavspeekes w inalnsa 

ITRAKNRILDEVTVEEDSYLAHPTRDRAKIQHSRRPPTRGHIjMA 
VAS TSTSDGMLTLDLIQEEDPSPEEPTSLC 


6268 


160 


1368 


~HREI>CQ1MLPAGLSSALIDNPLTLLI>SIDTYVMLQEPVTFQDVAV 
DFSREEWGLI^PTQRTEYRDVMLETFGHLVSVGWETTLENKELA 
PNSDI PEEEPAPSLKVQESSRDCALSSTLEDTLQGGVQEVQDTV 
LKQMESAQEKDLPQKKHFDNRESQANSGALDTNQVSLQKIDNPE 

S QANSG ALDTN QVLLHKI P P R KRLR KRDS Q VKS MKHNSR VKIH Q 
KSCERQKAKEGNGC31KTFSRSTKQITFIRIHKGSQVCRCSECGK 
I FRNPRYFSVHKKIHTGERP YVCQDCGKGFVQS SSLTQHQRVHS 
GERPFECQECGRTFNDRSAX£>UH1»RTH rva/UUr I is.^~wu\^\y jvrtr k,w 
S S HLIRHQRTHTGERPYACNKCGKAFTQS SHL I GHQRTHNRTKR 
KKKQPTS 


6269 


2886 


1449 


HASAPTRRNMAAASPLRDCHAWKDARLPLSTTSNEACKL.FDATL 
TQ YVKWTNDKS LGG I EG CLS KLKAADPTFVMGHAMATGLVL I GT 
GSSVKLDKELDLAVKTMVEISRTQPLTRREQLHVSAVETFANGN 
FPKACELWEQILQDHPTDMLALKFSHDAYFYLGYQEQMRDSVAR 
IYPFWTPDIPLSSYVKGIYSFGLMETNFYDQAEKLAKEALSINP 

TDAWS VHTVAH IHEMKAE I KDGLEFMQHSETLWKDSDMLACHNY 
WKWALYLIEKGEYEAALTIYDTHILPSU2ANDAMLDVVDSCSML 
YRLQMEGVSVGQRWQDVLPVARKHSRDHILLFNDAHFLMASIiGA 
HDPQTTQELLTTLRDAS ES PG ENCQHLLARDVGLPLCQALVEAE 
DGNPDRVLELLLP I RYRIVQLGGSNAQRDVFNQLLIHAALNCTS 
SVHICNVARSLLMERDALKPNSPLTERLIRKAATVHLMQ 


6270 


23 


2086 


S VTVTLGSEGDGRPPT YHLEEMEQEPQNGEPAE1 KI IREAYKKA 
FL,FVNKGLNTDELGQKEEAKNYYXQGIGHIiLRG IS I SSKESEHT 
GPGWESARQMQQKMKETI^NVRTRLEILEKGLATSI^IDLQEVP 
KLYPEFPPKDMCEKLPEPQSFSSAPQHAEVNGNTSTPSAGAVAA 
PAS LS LPS QS CP AEAP PAYTPQAAEGHYTVS YGTDSGEFS S VGE 
EFYRNHSQPPPLETLGLDADELILIPNGVQIFFVNPAGBVSAPS 
YPGYLRIVRFLDNSLDTVLNRPPGFLQVCDWLYPLVPDRSPVLK 
CTAGAYM FPDTMLQAAGCFVGWLSS ELPEDDRELFEDLLRQMS 
DLRIjQANWNRAEEENEFQIPGRTRPSSDQLKEASGTDVKQLJDQG 
NKDVRHKGKRG KRAKDTS SEEVNLSHI VP CEPVPEE KP KELPEW 
SEKVAHNILSGASWVSWGLVKGAEITGKAIQKGASKLRERIQPE 
E KP VEVS P AVTKGL Y IAKQATGGAAKVSQFLVDGVCTVANCVGK 
ELAPHVKKHGSKLVPESIiKKDKDGKSPIJ)GAMWAASSVQGFST 
VWO^LECAAKCIVNNVSAETVQTVRYKYGYNAGBATHHAVDSAV 
NVGVTAYN I NNIG I KAMVKKTATQTGHTLLEDYQI VDNSQRENQ 
EGAANVNVRGEKDEQTKEVKEAKKKDK 


6271 


32 


1058 


GCG VKTAGMVG REKELS I HFVPGS CRLVEEBVNI PNRRVLVTGA 
TGLLGRAVHKE FQQNNWHAVGCG FRRARP K FE Q VNLLD SNAVHH 
I IHDPQPHVI VHCAAERRPDWENQPDAASQLNVDASGNLAKEA 
AAVGAFLI Y I S SDYVFDGTNPP YR RED I PAPLNLYGKT KLDG EK 
AVLENNLGAAVLRI P I LYGEVEKLEES AVT VM FDKVQFSNKSAN 
MDHWQQRFPTHVKDVATVCRQbAEKRMLDPSIKGTFHWSGNEQM 
TKYEMACAIADAFNLPSSHLRPITDS PVLGAQRPRNAQLDCSKL 
ETLGIGQRTPFRIGIKESLMPFLIDKRWRQTVFH 


6272 


1136 


528 


G AVMEDAAAP GRTEGVIiERQGAP PAAGQGGALVELTPT PGGLAL 
VS P YHTHRAGD PLDLVALAEQVQKADEFI RANATNKLTVI AEQ I 
QHLQEQ ARKVLEDAHRDANLHHVACN I VKKPGNI Y YLYKRE5 GQ 
QYFS I ISPKEWGTSCPHDFLGAYKLQHDLSWTP YEDIEKQDAKI 
SMMDTIiLSQSVALPPCTEPNFQGLTH 


6273 


256 


843 


SCPRVSPECRSLGCQVMFSLPJLNCSPDHIRRGSCWGRPQDLXIA 
SAAWNS KCHPG AG AAMARQHARTLW YDRPRYVFME FCVEDSTDV 
HVLIEDHRIVFSdKNADGVELYNEIEFYAKVNSKDSQDKRSSRS 
ITC FVRKWKEKVAW PRLTKEDI KP VWLS VDFDNWRDWEGDEEME 
LAHVEHYAEVRDNTYCVLPT 


6274 


56 


1142 


AAAAMAAAAGGGAGAARSLSRFRGCLAGALLGDCVGSFYEAHDT 
VDLTS VLRHVQ S L E P DPGT PGS ERTEAL Y YTD DTAMARALVQS L 
IAKEA FDE VT3MAH RFAQ EYKKD PDRG YGAGVVTVT KKLLNP KCR 
DVFE PARAQFNGKGS YGNGGAMRVAG I SLAYS SVQDVQKFARLS 
AQLTHAS SliG YKGAI LQ AIiAVHIJ^jC^E S SS KHFLKQI*LGHM 
LEGDAQSVXJ5ARELGMEERPYSSRLKKIGELLDQASVTRBEVVS 
E LGNG I AAFES VPTAI Y CFLRCMEPDPE I PSAFNSLQRTLI YS I 
S LGGDTDT I ATMAGAIAGAY YGMDQVPES WQQS CEG YEETD I LA 
QSLHRVFQKS 


6275 


20 


565 


SRRGRARCLARGSRRP VPRPAKTMAFM VKTM VGGQLKNLTGS LG 
GGEDKGDGDKSAAEAQGMSREEY^EYQKQLVEEKWERnAQFTQR 
KAERATLRSHFRDKYRLPKNETDESQIQMAGGDVELPRELAKMI 
EEDTEEEEEKASVLGQl^LPGLNLGSLKDKAQATLGDLKQSAE 
KCHVM 


6276 


797 


97 


TLLPLPP LPDT EGM I LLNTGLEGT VAENP VP I VHT PSGNI LTLE ! 
SCUWIATHPGHWGIHLQIAEPAALRPSIJU^LARI^SI^LIiHW 
VWVGAKI SHGSFS VPGHVAGREIJ/TAVAEWPHVTVAPGWPEEV 
LGSGYREQI^TDMLELCO^LWQPVSFQMOAMLLGHSTAGAIGRL 
LASS PRATVTVEHN PAGGD YAS VRTALLAARAVDRTRVYYRLPQ 
GYHKDLLAHVGRN 
sequence


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 


6277 


4600 


2744 


MAFRTEMGLYYSyPKTlVEAPSFLNGVWMIMNDKLTEYPIiVINT 
LKRFNLYPEVILASWYKIYT^MDLIGICrrKIOrrVTIGEGL.SP 
TE S CEGDG D PACFYVAV I F I LNGItMMALiFF I YGTYuSGSRIjGGL 
VTVLCFFFNHGECTRVlWTPPIjRESFSYPFLVLQMIiVTHII^ 
TKiYRGSLIALCISNVFFMLPWQFAQFVLLTQIASLFAVYVVGY 
IDICKLRKI IYIHMIS LALC FVLMFGNSMLLTS YYASS LV I IWG 
IIJU^HFLKINVSEXSLWVIC^CFWIiFGTVILKYLTS KI FGIA 
NDAHI GNLLTS K5*FS YKDFDTL»L»YT CAAEFDFMEK3TFLR YTKT 
LliLPWLVGFVAIVRKI ISDMWGVIAKOOTHVRKHQFDHGELVY 
HALQLIAYTAIiG I LIMRI>KLFLTPHMCVMASLI CSRQLFGWLFC 
KVHPGA I VFAJ I*AAMS I QGS ANLiQTQWNTVGBFSNI*PQEEIj I EW 
IKYSTKPDAVFAGAMPTMASVXI^ALRPIVNHPHYEDAGLRART 
KTV^S^SRKAAEEVKRELIlOiKVNYYIIjEESWCVlUlSKPGCSM 
PEIWDVEDPANAGKXPLCNLLVKDSKPHFTTVFQNSVYXVLEVV 
KB 


6278 


3 


823 


HjFRIiVLLSLVYLLNSVATEERKPABVUVEGQQYAWGTVLLL 
I RI I LE YCQGVDN I P S VTTDMLTRLSDLIjKYFN SRS CQLVLGAG 
AIjQVVGLKTITTIGJIJ\IjSSRCIjQIjIVHYIPVT 
yspilrhfdhitkdyhdhxaei SAKLVAIMDSLFDKXLSKYEVKA 
PVPSACFRNICKQMTKMHEAI FDLrLPEEQTQMLFXRINASYKLH 
LKKQI^HLNVINDGGPONGLVTADVAFYTGNLQAIjKGLKDIiDLN 
MAEIWEQXR 


6279 


127 


1687 


GGAMASIX3ARKQFWKRSNSKLPGSIQHVYGAQHPPFDPLLHGTI, 
LRSTAKMPTTPVKAKRVSTFQEFESOTSDAWDAGEDDDEXiLAMA 
AESLNS EVVMETAWRVLRNHS QRCjGRPTLQEGPGLQQKPRPEAE 
PPSPPSGDLRIiVKSVSESHTSCPAESASDAAPLQRSQSIiPHSAT 
VTLGGTSD PSTI>S S SAI>SEREASRIiDKFKQLIjAGPNTDLEEI*RR 
LS WSG I PKPVRPMT WKLLSGYIiPANVDRRPATLQRKQKSYFAFl 
EHYYDSRNDSVHQDT YRQIH I D I PRMS PEALI LQPKVTE 1 FER Z 
LFIWAIRHPASGYVQGINDLVTPFFWFICEYIEAEEVDTVDVS 
G VP AE VL CN I EADT YW CMS KLi IiDG I QDNYTFAQ PG I QMK VKMLE 
ELVSR IDEQVHRKIJ^HEVRYliQFAFRWMNin^liMREWI^CTI R 
LWDT^QSEPDGFSHFHLYVCAAFliVRWRKEIliEEKDFQELIXFI, 
QNIjPTAHWDDEDISLLUAEAYRLKFAFADAPNHYKK 


6280 


857 



2515 


ECCDQKMGSRNSSSAGSGSGDPSEGLiPRRGAGL»RKh»iiliJitii5liUK 
DVDLAQVIAYIiLRRGQVRLVQGGGAANI^FIQALIiDSEEE3roRA 
WDGRI^DRYNPPVDATPDTRELEFNEIKT^VEIJVTGQliGl^RAA 

QKHSFPRMLHQRERGI*CHRGSFSLGEQSRVI SHFLPNDLGFTDS 
YSQKAFCG I YSKDGQI FMSACQDQTI RLYDCRYGRFRXFKS I KA 
RDVGWSVLDVAFTPDGNHFLYSSWSDYIHICNIYGEGDTHTALD 
LRPDERRFAVFSIAVSSDGREVLGGANIX3CLYWDREQNRRTLQ 

I ESH EDDVNAVAFAD I S S Q I LFSGGDDAICKVWDRRTMREDDPK 
PVGAL1AGHQDGITFIDSKGDARYI.ISNSKDQTIKLWDIRRFSSR 
EGMFJ^ROAATOONVroYRWOOVPKKAWRKLKIjPGDSS LMTYRGH 
GVIiHTLIRCRFSPIHSTCQQFIYSGCSTCKVVVYDIjI^GHIVKK 
LT^IHKACVRJ)VSWHPFEEKIVSSSWDGNLRLWQYRQAEYFQDDM 
PESEECASAPAPVPQSSTPFSSPQ 


6281 


857 


2515 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEBEEEDE : 

DVDLiAQVtAYLU^PJGQVRLVQGGGAAm>QFI07u J X.DSEEENDRA 

WDGRLGDRYNPPVDATPDTRELEFNEIKTQVELATGQLGIiRRAA 

QKHSFPRMI>HQRERGLCHRGSFSIiGEQSRVISHFIiPNDLGrrDS 

YSQKAFCG I YS KDGQ I FMS ACQDQT I RL YDCR YGRFRKFKS I KA 

RDVGWSVI^VAFTPEXjNHFIjYSSWSBYIHICNIYGEGDTHTALD 

LRPDERRFAVFS IAVSSDGREVLjGGANDGCLYVFDREQNRRTLQ 

iesheddvnavafadissqilfsg<;ddaickvwdrrtmreddpk 
pvg ai»aghqdg i tfi ds kgdaryxii sns kdqt i klwdirrfs s r 
ID
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








EGMEASRQAATQQNWDYRWCXJVPICKAWRKLKJbPGDSSIiMTyRGH 
GVLHTLIRCRFSPIHSTGQQFI YSGCSTGKWVY DLLSGHI VKK 
LTNH KACVRDVSWHP FEEKI VS SSl^nXSNIiRLWQYRQAE YFQDDM 
PESEECASAPAPVPQS STPFSSPQ 


6282 


125


906 


RMAACRALKAVLVDLSGTLHI EDAAVPGAQEAL KRLRGAS V 1 1 R 
FVTNTTKES KQDLLE RLRKJLE F D I SEDE I FTSIiTAARS LliERKQ 
VR PMbLVDDRALPDFKG I QTSDPNAVVMGLAP EHFHYQ I LNQAF 
RliLLDGAPIilAIHKARYYKRKIXSLAXiGPGPFVTALEYATDTKAT 
WG KPEKTF FLEALRGTGCEPEEAVMIGDDCRDDVGGAQDVGML 
GILVKTGKYRASDEEKINPPPYLTCESFPHAVDHILQHLL 


6283 


140 


1043 


LSLFGIHVMNPFWSMSTSSVRKRSEGEEKTLTGDVKTSPPRTAP 
KKQLPS I PKNALPITKPTS PAPAAQSTNGTHAS YGPFYI*E YSLL 
AEFTL WKQKL PG VYVQ PS YRSALMW FGVI F IRHGLYQDGVFKF 
TVYI PDNYPDGDCPRLVFDIPVFHPLVDPTSGEIiDVKRAFAKWR 
RJtmNHIWQVLMYARRVFYKIDTASPLNPEAAVl»YEKDIQLFKSK 
WDSVKVCTARLFDQPKI EDPYAIS FSPWNPSVHDEAREKMLTQ 
KKKPEEQHNKSVHVAGLSWVKPGSVQPFSKEEKTVAT 


6284 


1 


2879 


RSVI PGSTISSRWPGLSRPRFMAAHEWDWFQREEIjIGQISDIRV 
QNLQ VERENVQ KRTFTR W I NL»HI>E KCN P PLEVKDLi FVD I QDG KI 
LP4AIJjEVI*SGRNI»LHEYlCSSSHRIFRXiWIAKAI*KFI^DSNVKL 
VSIDAAEIADGNPSLVLGLIWNIILFFQIKELTGNLSRNSPSSS 
IiAPGSGGTDSDSSFPPTPTAERSVA I SVKDQRKAI KALI*AWVQR 
KTRKYGVAVQDFAGS WRSGIAFIiAV I KAIDPSLVDMKQALENST 
RENLEKAFSIAQDALHIPR I J.EPEDIMVDTPDEQS IMTYVAQFL 
ERFPELEARDI FDSDKEVP I ESTFVR I KETPSEQ ESKVFVX.TEN 
GERTYTVNHETSHP PPS KVWCDKPESMKEFRLDGVSSHALSDS 
STEFMHQI I DQVI*QGGPGKTSDI SEPSPESSILS SRKENGRSNS 
LPIKKTVHFFJUOTYKDPFC^KNI^LCFEGSPRVAKESLRQDGHV 
IAVEVAEEKEQKQESSKIPESSSDKVAGDIFIiVEGTNNNSQSSS 
CNGALESTARHDEESHSLSPPGENTVMADSFQIKVNLMTVEALE 
EGDYFEAI PLKASKFNSDLIDFASTSQAFNKVPS PHETKPDEDA 
EAFENHAEKLGKRS IKSAHKKKDS PEPQVKMDKHEPHQDSGEEA 
EGCPSAPEETP VDKXPEVHE KAKRKSTRPHYEEEGEDDDLQG VG 
EELSSSPPSSCVSLETLGSHSEEGLDFKPSPPIiSKVSVIPHDLF 
YFPHYEVPLAAVLEAYVEDPEDLKNEEMDLEEPEGYMPDLDSRE 
EEADGSQSSSSSSVPGESLPSASDQVLYLSRGGVGTTPASEPAP 
I*APHEDHQQRETKENDPMDSHQSQESPN1.ENIANPIjEENVTKES 
ISSKKKEKRKHVDHVESSLFVAPGSVQSSDDLEEDSSDYS I PSR 
TSHSDSS I YLRRHTHRSSESDHFSLCSVEERSRSG 


6285 


2157 


1331 


S CKT ENLLEMWWFQQG hS FLPS ALVI WTS AAFI FS Y I TAVTliHII 
ID PALP Y I SDTGTVAP E KCI*FG AMLNI AAVLCIAT I YVRYKQ VH 
AL.S PEENVI I KLNKAGL VLG I LS CIGLS I VANFQ KTTL FAAH V S 
GAVLTFGMGSLYMFVQTILSYQMQPKIHGKQVFWIRIjLLVIWCG 
VSAI^MLTCSSVrJiSGNFGTDL.EQKLHWNPEDKGYVLHMITTAA 
EWSMS FS FFGFFLT Y I RDFQKI SLRVEANIiHGL»TI*YDTAPCP IN 
NERTRLLSRDI 


6286 


1619 


276 


KAGASCXGSANPWSVGKSCVLLAMAQLQTRFYTDNKKYAVDDV 
PFSIPAASEIADLSNI INKLL.KDKNEFHKHVEFDFLIKGQFLRM 
PLDKHMEMENI SS BE WE IE YVEKYTAPQPEQCM FHDD W I SS I K 
GAEEWILTGS YDKTSRIWSLEGKS IMT I VGHTD WKDVAWVKKD 
SLS CLLLSASMDQTI LLWEWNVERN KVKALHCCRGHAGSVDS IA 
VDGSGTKFCSGSWDKMLKIWSTVPTDEEDEMEESTNRPRKKQKT 
EQLGI*TRTPI VTLSGHMEAVSS VLWS DAEE I CS AS WDHTT RVWD 
VESGSLKSTLTGNKVFNCISYSPLCKRLASGSTDRHIRIiWDPRT 

KDGSLVSLSLTSHTGWVTSVKWSPTHEQQLI SGSLDNI VKLWDT 
P^CKAPLYDLAAHEDKVLSVDWTDTGLLLSGGADNKLYSYRYSP 
beginning
corresponding

to first 

amino acid 

residue of 
sequence


Predicted end 
nucleotide 
location 
to first
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








TTSHVGA 


6287 


276 


1482 


MQFFFNFQIGI*RSTSGKEKYSGDAGPLGDALQI*FLQCIiALDEDF 
APAKLQVQKILCDLLLPEKLKEGLKESSWSSLPCTKNRPFDFHS 
VMEESQSLNBPSPKQSBEI PEVTSEPVKGSLNRAQS AQS INSTE 
MPAREDCLKRVSSEPVLSVQEKGVLLKRKLSIiLEQDVIVNEDGR 
NKLKKQGET PNEVCM FS LAYGD I PEE1*I D VS DFECS LCMR LFFE 
PVTTPCGHS FCKNO J ERCIiDHAPYCPl.Ci<^SLKEYlJu0RRYCVT 
QLIjEELI VKYLPDE LSERKK I YDEETAELSHI/TKNVP I FVCTMA 
YPTVPCPLHVFEPRYRLMIRRSIQTGTKQFGMCVSDTQNSFADY 
GC^n.QIRNVHFLPDGRSVVDTVGGKRFRVLKRGMKDGYCTADI E 
YLEDV 


6288 


1 


743 


VTLYPCRGLVGNLLIX3ASGMASGCKIGPSILNSDLANU5AECLR 
MLDSGADYLHLDVMDGHFVPNITFGHPVVESLRKQIX3QDPFFDM 
HMMVSKPEQWVKPMAVAGANQYTFHLEATENPGAL.IKDIRENGM
KVGI4AJCKPGTSVEYTjAPWANQIDMALVMTV^PGF<^QKFWEDMM 
PKVHWLRTQ FPSLD I EVDGGVGPDTVHKCAEAGANM IVSGSAIM 
RS EDPRSVINTjLRNVCSEAAQKRSLDR 


6289 


a 


743 


VTLYPCRGLVGNLLIiGASGMASGCKIGPS I LN S DLANLGAECLR 
MLDSGADYLHLDVMDGHFVPNITFGHPWESI.RKOLGODPFFDM 
HMMVS KPEQWVXPMAVAGANQ YT FHLiEATENPGALI KDIRRNGM 
K^GLAIKPGTSVF^TIAPWANQIDMALVMTVKPGFGGQKFMEDMM 
PKVHWLRTQ FP S LD I E VDGGVG P DTVHKCAEAG ANM I VS G S A I M 
RSED PRSVIN LI>RNVCS EAAQKRS LDR 


6290 


3 

* 


1856 


Tl/SRWLIiGVYETVAPTIJVCLPRPRLRRRRRRRRRRMISRYTRKA 
VPQSLELKG ITKHALNHHPPPEKLEE IS PTSDSHEKDTSSQSKS 
DI TRESS FTSADTGNSLSAFPSYTGAG IS TEGSSDFSWGYGELD 
QNATE KVQTMFTAI DKLLYEQKLS VHTKS LQE E CQQWTAS FPHL 
RILGRQI ITPSEGYRLYPRSPSAVSAS YETTLSQERDSTI FGIR 
GKKLHFSSSYAHKASS IAKSSSFCSMERDEEDS I IVSEGI IEEY 
LAFDH I DI EEG FHGKKSEAATEKQKLG YP P IAPFYCMKEDVLAY 
VFDSVWCKWSCMEQLTRSHWEGFASDDESNVAVTRPDS ESSCV 
LSELHPIiVLPRVPQSKVI* Y I TSNPMS L»CQASRHQPNVNDLI»VHG 
MPLQPRNLSIJ^DKLiDLDDKIjL^PGSSTILSTRNWPNRAVEFS 
TSSLSYTVQSTRRRNPPPRTLHPISTSHSCAETPRSVEEILiRGA 
RVPVAPDSLSSPSPTPIiSRNNLLPPIGTAEVEHVSTVGPQRQMK 
PHGDS SRAQSAWDEPNYQQPQERLLIiPDFFPRPNTTQS FLLDT 
QYRRS CAVEYPHQARPGRGS AGPQLHGSTKSQSGGRP VS RTRQG 
P 


6291 


1732 


602 


LVAKMASSASARTPAGKRVINQEELRRLMKEKQRLSTSRKRIES 
PFAKYNRLGQLS CALCNTPVKSEIjLWQTHVIiGKQHREKVAELKG 
AKEASQGSSASSAPQSVKRKAPDADDQDVKRAKATLVPQVQPST 
SAWTTNFDKIGKEFIRATPSKPSGLSI*LPDYEDEEEEEEEEEGD 
GERKRGDASKPLS DAQGKEHS VS SSREVTSS VLPNDFFSTNP P K 
API I PHSGSIEKAEIHEKVVERRENTAEALPEGFFDDPEVDARV 
RKVDAPKDQMDKEWDEFQKAMRQVNTISEAIVAEEDEEGRLDRQ 
IGEIDEQIECYRRVEKLRNRQDEIKNKbKEILTIKELOKKEEEN 
ADS DDEGELQDLLSQDWRVKGALL 


6292 


1835 


1142 


TCPGAMKMVAPWTRF YSNS CCLCCHVRTGTI LLG VW YL I INAW 
LL I LLSALADPDQ YNFSSSEIiGGDFEFMDDANMC IAI A ISLXM I 
h I CAMATYG A YKQRAAW 1 1 PFFCYQ I FD FALNMLVAI TVL IYPN 
SIQEY IRQLPPNFPYRDDVMSVNPTCLVLI I LI*FI S I ILTFKGY 
LI SCVWNCYRYINGRNSSDVLVYOTSNDTTVLLPP YDDATVNGA 
AKEPPPPYVSA 


6293 


2382 


1035 


FWCTLGT VDVHPIGWCAI NS K ILVPPRT IHAKFTDWKGYLMKRI> 
VGSRTIiP VD FHIKMVESMKYP FRQGMRIiE WDKSQ VS RTRMAW 
DTVIGGRIiRLIiYEDGDSDDDFWCHMWSPLIHPVGWSRRVGHGIK 
sequence


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








MSERRSDMAHHPTFRKI YCDAVP YLFKKVRAVYTEGG WFEEGMK 
IiEAIOPLNLGNI CVATVCKVIjIiDGYLM I CVDGG P STDGLDWFCY 
HASSHAI FPATFCQKND I ELT P PKG YE AQTFNWENYIjE KTKS KA 
APS RLFNMDCPNHG FKVGMKLEAVDLME PRLI CVATVKRVVHR L 
LS IHFDGWDS EYDQ WVDCESPD I YP VG W CELTG YQLQ P PVAAEP 
ATPLKAKEATKKKKKQFGKKRKRIPPTKTRPLRQGSKKPIjLEDD 
PQGARKISSEPVPGEIIAVRVKEEHLDVASPDKASSPELPVSVE 
NIKQETDD 


6294 


354 

* 


1814 


AQLTTRGRTVAGG VRW I PS PF PDLELYSCCLGTDRGFPEIiSHHC 
KNVIATASDYDMAEITNIRPSFDVSPVVAGIiIGASVLVVCVSVT 
VFVWSCCHQQAEKKHKNP P YKF IHMLKG I S I YPETLSNKKKI I K 
VRiU3KDGPGREGGRRNIJjVDAAEAGLLSRDKDPRGPSSGSCIDQ 
LP IKMDYGEELRSPITSIjTPGESKTTSPSSPKISDVMIjGS LTFSV 
DYNFPKKALWTIQFJ^GLPV^DQTQGSDPYIKMTILPDKRHR 
VKTRVLRKTLDPVFDETFTFYGIPYSQLQDLVLHFL.VLSFDRFS 
RDDVTGEVMVPIiAGVDPSTGKVQLTRDI I KRNIQKCISRGELQV 
SLS YQPVAQRMTVWL KARHLQKMD I AGLSGNP YVKVNVYYGRK 
R I AKKKTHVKKCTLNP I FNES P I YDI PTDLLPDIS1EFLVIDFD 
RTTKNEWGRIj I LGAHS VTASGAEHWREVCES PRKPVAKWHSLS 
EY 


6295 


2795 



617 


VSSALLTGATSGSElAAKSEGASASPLSCTNAVAMDRPDEGPPAK 
TRRLSSSES PQRDPPPP P PPPPIiLRIxPLPPPQQRPRLQEETEAA 
QVLADMRGVGIjGPAIiPPPPPYVILEEGGIRAYFTLGAECPGWDS 
TIESGYGEAPPPTESLEALPTPEASGGSLEIDFQWQSSSFGGE 

RRRRRRRRKQRKVKRESRERNAERMES ILQALEDIQLDLEAVNI 
KAGKAFLRLKRKFIQMRRPFToERRDLIIQHIPGF^VKAFIiNHPR 
1 S ILINRRDEDI FRYI»TN1»QVQDLRH I SMGYKMKLYFQTNP Y FT 
NMVIVKEFGRNRSGRJbVSHSTPIRWHRGQEPQARRHGNQDASHS 
FFSWFSNHSLPEADR IAE I IKNDLWVNPLRYYLRERGSRI KRKK 
QEMKKRKTRGRCEWIMEDAPDYYAVEDI FSEI SDIDETIHDIK 
ISDFMETTDYFETTDNEITDINENI CDSENPDHNEVPNNETTDN 
NE^ADDHET^TONNESADDNNEWPEDNNKNTDDNEENPNNNENTY 
GNNFFKGGFWGSHGNNQDSSDSDNEADEASDDEDNDGNEGDNEG 
SDDDGNEG DNEGSDDDDRD I EYYEKV I EDFDKDQADYEDVI E 1 1 
SUES VEEEGIEEG 1 QQDED I YEEGNYEEEGSEDVWEEGEDSDDS 
DLEDVLQVPNGWANPGKRGKTG 


6296 


727 


1199 


RHCGCDAQGACDSIjPPTGTSSPVTARJNAI PEARCCVWLLDGTTV 
EAVRPARERLARKELRQKRMQQFSRDSAYSSNKDSTCLLTERDT 
LGTSIjQ FPS PFSGTI S FGS FS DSG I FPLGSQCCLG FQQ FS I SGK 
KWALIHKRVRLSVFGARWGRI YFGK 


6297 


1 


922 


QRAAAAS PS SCGPRGA£YGAIiMAMEG YWRPLALIjGS ALiLVG FLS 
VIFAI>VWVLHYREGIjGWIX3SALEFNWHPV1iMVTC 
VYRIiPWTWKCS KLLiMKS I HAG LNAVAAI LAI I S WAVFENHNVN 
NI ANMYSLHS WVGI>IAVI C YLLQLIjSGFS VFLL PWAPLS LRAFL 
MP IHVYSG I VI FGTV I ATALMGLTEKL I FSLRDPAYS TFPPEGV 
FVNTLGLL I LVFGALI FW I VTR PQWKRP KEPNSTI LHPNGGTEQ 
GARGSMPAYSGNNMDKSDSEIJmEVAARK3WLAIiDEAGQRSTM 


6298 


3 


985 


SVPLRR^SI^GTtWAGTTTKT^VARIiAAVAAWVPCRSWGWAAV 
PFGPHRGLSVLIARI PQRAPR WLPACRQKTSLSFLNR PDLPNLA 
YKKLKGKS PGI I FI PGYIiSYMNGTKAIAI EEFCKSLGHACIRFD 
YSGVGSSDGNSEESTLGKWRKDVLSI I DDLADGPQ I LVGS SLGG 
VJLt^HAAIARPEKWALIGVATAAIOTjVTKFNQLPVELKKEVEM 
KGVWSMPS KY SEEG VYNVQYS F 1 KEAEHHCLIjHSP I PVNCPIRI* 
LHGMKDDI VPWHTSMQ VADRVLSTDVDV I LRKHSDHRMRE KADI 
QLLVYTIDDLIDKIJST I VN 
ID
NO: 


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)






a is 


RfHT .'PYSTM'D'KfVPT^T JCT.OTTsVSCDT i^nTTAJUDPVTCT T">Q & TT.TCC 

a viiiiE»u mr u v J. 101*0 iix^ ir<uo iri_i\^L/uu vxiir v_ v i oiiuoiu ii i do 
SIDAMDI^AFSGPYKFPFTPPLESFNLCFYTSQVPVPPILGFYQ 
MKEEEVQLRNNH 


6300 


121 


692 


AAPSCWSQRGVPAAGTPSS PRLLVSRAAAPSAGPW3AWRQGARA 
AQSPFSIPNSSSVPYGSQDSVHSSPSDGGGGRDRPVGGSPGGPR 
LVIGSLPAHIjS PHMFGGFKCPVCSKFVSSDEMDLHLVMCLTKPR 
ITYNEDVLS KDAGECAI CLEELQQGDTIARLPCLCI YHKGCIDE 
WFEVNRSCPEHPSD 


6301 


616 


284 


GKFVPVNWEPPQPLFFPKyLRCYRCT.T.F.TKEjOGCLljGSDICLTP 
AGSSCITI^KKNSSGSDVMVSIX^SKEQMSDCSNTRTSPVSGFW ! 
IFSQYCFLDFCNDPQNRGLYTP 


6302 


490 


745


I FGFLHLFHMEHS FDl,VCALFAHVFFS SSCGSS VALHSDPC1ULS 
PVliLNCIJPGDLRPl^EIi YAQBQjKYKAI SEELDHAUTOMTS L 


6303 


2 


1961 


YKNEYGGGI*LWQSWQEKHPGQAI>SSEPWNFPDTKEEWEQHYSQL 
YWYYLEQFQYWEAQGWTFDASQSCDTDTYTSKTEADDKNDEKCM 
KVDbVS FLSS P IMGDNDSSGTSDKDHS EI LDG I SNI KLNSEEVT 
QSQLDSCTSHDGHQQLSEVSSKRECPASGQSEPRNGGTNEESNS 
SGNTNTDPPAEDSQKSSGANTSKDRPHASGTDGDESEEDP PEHK 
PSKLKRSHEIiD IDENPASDFDDSGSLLGFKYGSGQKYGG I PNFS 
HRQ VR YLEKKVKL KSKYIiDMRKQIKMKNKHIFFTKES EKPFFKK 
SKILSKVEKFI>T17VNKPMDEEASQESSSHDNGHDASrSCDSEEQ 
DMS VIQCGDDLLETNNPE PEKCQSVSS AGELETEN Y ERDS l»IiATV 
PDEODCVTQEVPDSRQAETEAEVKKKKNKKKNKKVNGLPPEIAA 
VPELAKYWAQRYRLFSRFDDGI KLDREGWFSVTPEKIAEH IAGR 
VSQS FKCDVWDAFCGVGGNTIQFALTGMRVIAIDIDPVKIALA 
RNNAEVYG I ADKI E FICGDFT iT J »AS FI»KAD WFLSP P WGGPD YA 1 
TAETFDIRTMMSPDGFEI FKLSKKITNNIVYFIiPRNADIDQVAS 
LAGPGGQVEIEQNFIiNNKLKTITAYFGDIjIRRPASET 


6304 


1 


1438 


HRAR VDRS RES PGGDLRHPGRVRRD I TIiSGHPRTjSTQH VVLLRE 
DEVGDPGTKDIjGHPQHGS P IQETQSE WT1*VSPIjPGSDMAAI»PA 
WRATS GLTL WPHTAEGRDLLGAENRAIiTGGQQAEDPTIASG AYQ 
WPGS VEKI*QGS VV7CDAETLLS S SRTGGQ AP P WI/TOHD VQMLRJjIi 
AQGEWDKARVPAHGQVLQVG FSTEAALQDLSS PRLSQLCSQGL 
CGLIKRPGDLPEVl^FHVDRVLGLRRSLPAVT^RRFHSPLLPYRY 
TIXSGARPVIWWAPDVQHI^DPDEDO^SIiAI^IjQYQALLAHSC^ 
WPGQAPCPGIHHTEWARIiAIiFDFLIiQVHDRIiDRYCCGFEPEPSD 
P CVEERIjREKCRNP AEIiRLVH 1 LVRS S DPSHLiVY 1 DNAGNLQHP 
EDKLNFPJLLEGIXKSFPESAVKVIiASGCLQNML.LKSI^ 
SQGGAGXSLKQVLOTLEQRGQVLI^HIQK^ 


6305 


59 


420 


NMIWRGRSTYRPRPRRSVPPPELIGPMLEPGDEEPQQEEPPTES 
RDPAPGQEREEDQGAAETOVPDLEADIiQELSQSKTGDECGDGPD 
VQGKILTKSEQFKMPEGR 


6306 


1 


1874 


PTRPSKVKVPHTFLIHSYTRPTVCQACKKLLKGLFRQGLQCKDC 

ESSDSGV2 PGS HSEN AJjHAS EEEEGEGG KAQS SLGYI PLMRWQ 

SVRHTTRKSSTTIJ^EGWVVHYSNKDTLRKRHYWRIjDCKCITLFQ 

NNTTNRYYKE IPLSEI LTVESAQNFSLVPPGTNPHCFEI VTANA 

TYFVGEMPGGTPGGPSGQGAEAARGWETAIRQALMPVILQDAPS 

APGHAPHRQAS hS I S VSNSQ I QENVDIATVYQ I FPDBVLGSGQF 

GVVYGGKHR KTGRI)VAVKVIDKIjRF PTKQBS QLRKE VAI LQS LR 

HPG I VNLECMFET PEBTVTVVMEKLHGDMIi EM ILSSEKGRLPERL 

TKFLITQILVALRHLHFKNIVHCDLKPENVLI^ASADPFPQVKIiC 

DFGFARI IGEKSF^RSWGTPAYIJ^EVLLNQGYNRSJaDMWSVG 

VIMWSLSGTFPFNEDEDINDQIQNAAFMYPASPWSHISAGAID 

LINNIiQVK^rO^YSVDKSI^HPWl^F^QTWlJ) 

Y ITHESDDAR WEQFAAEHPIUPGSGLPTDRDTjGGACPPQDHDMQG 
ID
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide 
location 
corresponding 
amino acid
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








IxAERXSVL 


6307 


2136 


589 


CFLLPRGRDPEPPEAGAAAPCAPGAPDMSFRXVVRQSKFRHVFG 
Q PVKNDQCYED IRVS RVTWDST FCAVNP KFIAVT VKASGGGAFI* 
VLPLSKTGRI DKAYPTVCGHTG PVLDIDWCPHNDEV IASGSEDC 
TVMVWQI PENGLTS PLTEP WVLEGHTKRVG 1 IAWHPTARNVLL 
SAGCDNVVLZTWVGTAEELYRLDSI^PDLiyNVSWNHNGSLFCS 
ACKDKS VRI IDPRRGTLVAEREKAHEGARPMRAI FLADGKVFTT 
GFSRMSERQLAIiWDPENLEEPMAIiQEIJ^SSNGALLPFYDPDTSV 
VYVCGKGDSS I R YFE I TEE PPY I HFTjNTFTSKE PQRGMG SM PKR 
GLEVS KCE IARFYKLHERKCEP I VMTVPRKSDL FQDDLY PDT AG 
PEAALEAEEWVSGRDADP I LI SLREAYVPSKQRDLKIS RRNVLS 
DS RP AMAPG S S HLGAPAS TTTAADAT P SGS LARAG EAG K LE EVM 
QELRALRALVKEOGDRI CRLEEQLGRMENGDA 


6308 


2 


1118 


GRPTRPEK^U^SLVLHTYSMRYLLPSVVL1X3TAPTWLAWG\WR 
LLSAFLPARFYQALDDRLY CVYQSM VLFFFEN YTG VQ I LL YGDL 
PKNXENT IYLANHQSTVDW I VADILAIRQNALGHVR YVLKEGUC 
WLPLYGWYFAQHGGIYVKRSAKFNEKEMRNKLQSYVDAGTPMYL 
VIFPEGTR YITOEQTKVLS ASG^IFAAQRGLAVLKHVLTPR I KATH 
VAFDCMXNYLDAIYDVTVVYEGKDDGGQRRESPTMTEFLCKECP 
KIHIHIDRIDKK3)VPEEQEHMRRWLHERFEIKDKMI,IEFYESPD 
PERRKRPPGKS VNS KLS I KKTLPSML I LSGLTAGMI*MTDAGRKL» 
YVNTWIYGTLLGCLWVTIKA 


6309 


220 


563 


LVAEVKEPCSLPMLSVDMENKENGSVGVKNSr4ENGRPPDPADWA 
VMD WNY FRTVG F EEQAS AFQEQE I DGKSLLLMTRNDVLTGLQL 
KLGPALKI YEYHVKPLQTKHLKNNS S 


6310 


36 


979 


GPRCWKFLILSSVNCETLRIGKAWPQSSGQERYWTPRTHSSAS3 
AQRGS LAELJsTVAAAGLWADCDQPLYPCPMCGL I CTNYHI LQEHV 
DLHLEEJSJS FQQGMDRVQCSGDLQLAIIQL.QQEEDRKRRS EESRQE 
IEEFQKLQRQYGLDNSGGYKQQQLRKME IEVNRGRMPPSEFHRR 
KADMMESIiALGFDDGKTKTSGI I EALHR YYQNAATD VRRVWLS S 
WDHFHSS LGDKGWGCGYRNFQMLLS S LLQNDAYNDCLKGM LIP 
CIPKIQSMIEDAWKEGFDPQGASQLIIRLQGTKAWIGACEVYIL 
LTSLRV 


6311 


1 


675 


PVWWNSCEGPRIJWUUiTGHGVGRRARLACLGEPRVKAAVKLTL 
AS KLKRDDGLKGS RTAATAS D S TRRVS VRDKLLV KEVAELtEANL 
PCTCKVHFPDPNKLHCFQLTVTPDEGYYQGGKFQFETEVPDAYN 
MV?PKVKCLTKIWHPNITETGEICI>SLLREHSIDGTGWAPTRTL 
KDVVWGI^SLFTDIJiNFDDPI^IEAABHHLRDKEDFRNKVDDYI 
KRYAR 


6312 


213 


1400 


GDELVKREAGMKMLPGVGVFGTGSSARVLVP1ARAEGFTVEALW 
GKTEEEAKQLAEEMNIAFYTSRTDDILLHQDVDLVCI S I PPPLT 
RQI SVKALGI GKNVVCEKAATS VDAFRMVTASRYYPQLMS LVGN 
VLRFLPAFVRMKQLI SEHYVGAVMI CDARI YSGSLLSPS YGWI C 
DEIiMGGGGLHTMGTYIVDLLTHLTGRRAEKVHGUiKTFVRQNAA 
IRG IRHVTSDDFCFFQMIi^GGGVCSTVTLNFNMPGAFVHEVMW 
GSAGR L VARG ADLYGQ KNS ATQEELLLRDS LAVGAGLP EQGPQD 
VPIJjiYLKGM V YMVQALRQS FQGQGDRRTWDRT P VSMAAS FEDGL 
YMQSWDAI KRS SRS^E WEAVEVLTEE PDTNQNLCEALQRNNL 


6313 


2 

* 


2071 


qrsgaarlaflpspfs pacvhrs plsfhgcwfyfvwfmplgvl 
fhrrrahgctlscssfveoptameaeetmeclqefpehhkmild 
rlneqreqdrftditlivdghhfkahkavlaacskffykffqef 
tqeplveiegvskmafrhlief^ytakiimiogeeeandvwkaae 
flqmleai kalevrnkensapleenttgkneakkrkiaetsnv i 
tes lps aesep ve i eve 3caegt i evedeg ietleevasakqs vk 
yiqstgssddsalalladitskyrqgdrkgqikedgcpsdptsk 
qvegieivelqlshvkdlfhcekcnrsfklfyhfkehmkshste 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid
sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








S FKCE I CNKRYLRESAW KQHLNCYHLEEGGVS KKQRTG KK.IHVC 
Q YCE KQFDHFGHFKEHLRKHTGE KP FECFNCHERFARNS TLiKCH 
IjTACQTGVGAKKGRKKL.YECQVCN S VFNSWDQFKDHL VI HTGDK 
PNHCTLCDLW FMOXSNEIJIRHLSDAHNISERLVTEEVLSVETRVQ 
TEPVTSMTI I EQVG KVHVLPIJ^VQVDS AQVTVEQVHPDLLQDS 
Q VHDSHMS E L P EQVQVS YLEVGR I QTEEGTEVHVEEIiHTVE RVNQ 
MPVEVQTEIAEADLDHVTPEIMNQEERESSQADAAEAAREDHED 
AEDLETKPTVDSEAEKAENEDRTALPVLE 


6314 


2 


2071 


QRSGAARJLAFLPSPFSPACVHRSPI^SFHGCWFYFWVFMPLGVL 
FHRRRAHGCTLSCSSFVEQPTAMEAEETMECLQEFPEHHKMILD 
RIiNEQREQDRFTDITI*IVIX3HHPKAHKAVIiAACSKFpyKFFQEF 
TQEPLVE I EGVS KMAPRHL I EFTYTAKLMIQGEEEANDVWKAAE 
FLQWLEAI KAI*EVRNKENSAPI*EENTTGKNEAKKRKIAETSNVI 
TESLPSAESEPVEIEVE I AEGTIEVEDEGI ETLEEVASAKQSVK 
YIQSTGSSDDSAIiALIiADITSKyROGDRlOSQIKEDGCPSDPTSK 
QVEG I E IVELQLSHVKDIiFHCEKCNRS FKLFYH F KEHMKS HSTE 
SFKCEICNKRYLRESAWKQHbNCYHLEEGGVSKKQRTGKKIHVC 
QYCEKQFDHFGHFKEHLRXHTGEKPFECPNCHERFARNSTLKCH 
LTACQTGVGAKKGRKKLYECQVCNSVFNSWDQFKDHLVIHTGDK 
PNHCTLCDLWFMQGNEI^RRHLSDAHN I SERLVTEEVLSVETRVQ 
TEPVTSMTIIEQVGKV/HVIiPIjIjQVQVDSAQVTVEQVHPDIilfQDS 
QVHDSHMSEiPEQVQVSYLEVGRIQTEEGTEVHVEELHVERVNQ 
MPVEVQTELtiEADIiDHVTPE IMNQEERESSQADAAEAAREDHED 
AEDIiETKPTVDSEAEKAENEDRTALPVLE 


6315 


1 


1015 


I^IiAVNVVTTLVLISYCPTATEEAPYWTYLLCAIXSLFIYQSLDA 
IDGKOARRTNS CS PLGEJLFDHG CDSI*S TVFMAVGAS IAARLGTY 
PDWFFSCSFIGMFVFYCAHWQTYVSGMLRFGKVDVTEIQIAIiVI 
VFVI>SAFGGATMWDYTI P I LEI KLKI L P VLGFLGG VI FSCSNYF 
HVILHGGVGKNGS T I AGTSVIjS PGLH I GLI I ILAI MI YKKSATD 
VFEKHPCLYILMFGCVFAKVSQKLWAHMTKSELYIrQDTVFI^GP 

gi*iifi»dqyfnnf ide yvvlwmamv i s s fdm vi yfsabclq isrh 
LHLlNI fktachqapeqvqvlsskshqnnmd 


6316 


1503 


792 


VS AGAGTG I MGGTTSTRRVT FEADENEN I TWKGI RI»S ENV I DR 
MKESSPSGSKSQRYSGAYGASVSDEELKRRVAEBLALEQAKKES 
EDQKRIjKQAKELDRERAAANEQIjTRAI LRERI CSEEERAKAKHIi 
ARQLEEKDRVIJCKQDAFTKEQLARI^ERSSEFYRVTTEQYQKAA 
EEVEAKFKRYESHPVCADLQAKIIj0^r5fRENTHO/rL.KCSAIiATQY 
MHCVNHAKQSMIiEKGG 


6317 


102 


839 


PEAQTSAVLAREKGRLPTMRHEAPMQMASAQDARYGQKDSSDQN 
FDYMFKLLI IGNS SVG KTSFLFRYADDS FTSAFVST VG I DFKVK 
TVFKNEKRI KLQ I WDTAGQERYRTITTAYYRGAMG F I LMYDI TN 
E E SFNAVQD WSTQ I KT YSWDNAQVTI»VGNKCDMEDERVI STERG 
QHLGEQLGFEFFETS AKDN INVKQTFEKLVDI I CD KMSESLETD 
PAITAAKQNTRIiKETPPPPQPNCAC 


6318 


1765 


733 


P WH PLRTIiP LHH PH PR PPRA EGREGADSMSHLPGLE JjRREAP P I* 
LGPIiSPFPLPAGSWHRQMIJiSSLRFPITNSAGAPCKAAGRMNI 
LAPVRRDRVXjAELPQCIiRKEAAl»HGHKD FHPRVTCACQEHRTGT 
VGFKI S KVIWGDLS VGKTCLINRFCKDT FDKNYKAT I GVDFEM 
ERFEVLG I PFSLQLWDTAGQERFKCI ASTYYRGAQAI I IVFNLN 
D VASIiEHTKQWIADAIjICENDPS S VLLFLVGS KKDLSTPAQYALM 
EKDALQVAQEMKAEYWAVSSLTGENVRBFFFRVAALTFEANVLA 
E L.EKSGARRIGDWR INSDDSNL YLTASKKKPTCCP 


6319 


88 


717 


AATMRLNQNTLLLG KKWL VP YTS EHV PS R YHEWM KS E E LQRLT 
AS EPI*TLEQE YAMQCS WQEDADKCTFI VLDAEKWQAQPGATEES 
CMVGDVNLFLTDIiEDIjTIjGEIEVM I AEPSCRGKGIiGTEAVLAMI* 
SYGVTTLGLTKFEAKIGQGNEPS IRMFQKLHFEQVATSSVFQEV 
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








TLRLTVSES EHQWIOiEOTSHVEEKP YRDGSAE PC 


6320 


90 


1111 


RPRTGRE KVAMAAV DS FYLLYRE IARSCNCYMEALALVGAWYTA 
RKSI TV3 CDFYSLI RLHFIPRLGSRADLIKQYGRWAWSGATDG 
I G KA Y AE EliAS RGLN I ILISRNEEKI^VVAKDIADTYKVETDI I 
VADFSSGRE IYI^IREALKDKDVGILVNNVGVTYPYPQYFTQL>S 
EDKLWDI INVNIAAASLMVHVVLPGMVERKKGAIVTISSGSCCK 
PT P QLAAF S AS KA YLDHFS RALQ YEYAS KG I FVQSLIPFYVATS 
MTAP SNFLHRCS WLVPS P KVYAHHAVSTLG I S KRTTGYWSHS IQ 
FLFAQYMPEWLWVWGANI UraSLRKEALSCTA 


6321 


1418


341 


HRKAALGALMAGRLLGKALAAVS LSLALASVTI RSSRCRG IQAF 
RNS FS SS W FRX»NTNVMSGSNGS KE1TSHNKARTS P YPGSKVERS Q 
VPNEKVG W LVEWQDYKPVE YTAVS VTjAG PRWADPQIS ESNFS PK 
FNE KDGHVERKS KNG LYE I ENGR P RN P AGRTGLVGRGLLGRWG P 
NHAADPI T TRWKRDSSGNKIMHP VSGKH I LQFVAI KRKDCGEWA 
I PGGMVDPGEKI SATLKREFGEEALNSU3KTS AEKREIEEKLHK 
LFS QDHLVI YKGYVDDPRNTDNAVJMETEAVNYHDETGEIMDNLM 
LEAGDDlAGKVKWVDINDKIiKLYASHSQFIKXVAEKRDAHWSEDS 
EADCHAL 




6322


\ OR"* 

JLWOJ 


NOE I LKNVESS RTVOPHFLEFLLS LGWSVDVGRH PGWTGHVSTS 
WS INCCDDGEGSQQEEVISSEDIGASI FNGQKKVLYYADALTEI 
AFWPSPVESIjTDSLESNISDQDSDSNMDLMPGIIiKQPSLTI»EI* 
FPNHTDNLNSSQRLSPSSRMRiCLPOGRPVPPLGPETRVSVVWVE 
RYDDIENFPLSELMTEISTGVETTANSSTSLRSTTIJ2KEVPVIF 
IHPLNTGLFR I KIQGATGKFNMVIPLVDGMI VSRRALGFLVRQT 
VIN I CRRKRI»ES DSYS P PHVRRKQKITDI VNKYRNKQLE PEF YT 
SLFQEVGLKNCSS 


6323 


1 


656 


PASTTDGAQEARVPLDGAFW I PRP PAGS PKGCFACVSKP PALQA 
PAAPAPEPS AS P PMAPTLFPMES KS S KTDSVRAAGAP PACKHLA 
EKKTMTNPTTVI EVYPDTTEVNDYYLWS IFNFVYLNFCCLGFIA 
LAYS LKVRDKKliLNDLNGAVEDAKTDRLIN I TRS GLAAS C I ML W 
MALSVIATHRGLRSSASILVAEPHDWOTERPQVTFRERCPAL 


6324 


1 


2061


EGAGMRRC PCRGS LNEAEAGAL PAAARMGLEAP RGG RRRQ PGQQ 
RPGPGAGAPAGRPEGGGPWARTEGSSLHSEPERAGLGPAPGTES 
PQAE FVJTDGQTEPAAAGLG VETERPKQKTEPDRS SLRTHLEWSW 
S ELGTTCLWTE TGTDGLWTD PHRS DLQ FQ PEEAS P WTQ P GVHG P 
WTELETHGSQTQPERVKSWADNLWTHQNSSSIiQTHPEGACPSKE 
PS ADG S WKEL YTDGSRTQQD I EG P WTE P YTDGS QKKQDTEAAR K 
QPGTGGFQIQQDTDGSWTQPSTDGSQTAPGTDCLLGEPEDGPLE 
EPEPGELLTHLYSHLKCSPLCPVPRLI I TPETPE PEAQPVGPPS 
RVEGGSGGFSSASSFDESEDDWAGGGGASDPEDRSGSKPWKKL 
KTVLKYS PFWS FRKHYPWVQLSGHAGNFQAGEDGRXLKRFCQC 
EQRSLEQLMKDPIiRPFVPAYYGMVLQIX3G/rFNQMET)LLADFEGP 

s xmix:kmgsrtyi*eeelvkarerprprkdmyekmvavdp 

EEHAQGAVTKPRYMQWRETMSSTSTLGFRIEGIKKADGTCN*rNF 
KKTQALEQVTKVLEDFVDGD1TVII/3KYVACLEELREALEI S PFF 
KTHEWGSSLLFVHDHTGIJIKVWMIDFGKTVALPDHQTLSHRLP 
WAEGNREDG YLWGLDNM I CLLQGLAQ S 


6325


165 


944 


GLRDP FRRKRRLKPQVKMSNYVNDMWPGSPQEKDSPSTSRSGGS 
SRLSSRSRSRSFSRSSRSHSRVSSRFSSRSRRSKSRSRSRRRHQ 
RKYRR YSRS YSRS RSRSRSRRYRERRYG FTRRYYRS PSRYRSRS 
RSRSRSRGRSYCGRAYAIARGQRYYGFGRTVYPEEHSRWRDRSR 
TRSRSRTPFRLSEKI)RMFXLEIAKTNAAXAIX3TTNIDLPASLRr 
VPS AKET S RG I GV S S NGAKP EVS ILGLSEQMFQKANCQI 


6326 


238 


680 


GEPS PATQQKP SATGAGVLHQHFSSGH I YVLMGLLP P P WT I S FT 
VO/ITLQPPGGLPAAPVSGRMAFEPVGRDIiARRMVPRAGKRTQTL 
GARRVAACGARPLPEDRRPKSGERXJTVTVAPCWEFVLPSVSLTA
ID
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








QAWGG VGQEAS SGVP 


6327 


1 


1337 


SLARIiAPAGGSVVMPTQQPAAPSTRAPKPSRSbSGSL>CALFSDA 
DSGSGMKAELPP<3PGAVGREMTKEBKI^IiRKEKKO^KKKRKEBK 
GAEPETGSAVSAAQCQGPTRELPESGIQIjGTPREKVPAGRSKAE 
LRAERRAKQEAERALKQARKGEQGGPPPKASPSTAGETPSGVKR 
LPEYPQVDDIJJjRRIiVKKPERQQVPTRKDYGSKVSLFSHIjPQYS 
RQNSLTQ FMS I PSSVIHPAMVRLGLQ YSQGLVRGSNARCIALiR 
ALOQVIQDYTTPPNEELSRDLVNKI>KPYMSFLTQCRPLSASMHN 
AIKFLNKE ITSVGS SKREEEAKSEIiRAAI DRYVQE KI VLAAQAI 
SRFAYQKISNGDVILVYGCSSLVSRILQEAWTEGRRFRVWVDS 
RPWLEGRHTIoRSLVHAGVPAS Y1>L I PAAS YVLPEVS TEEKDS KV 
GGEKV 


6328 


1030 


276 


HASAEVTTAAARGLGAMEEEMHTDAKIRAENGTGSSPRGPGCSL 

RHFACEQNIjLSRPDG S ASFLQGDTS VLAGVYGPAEVKVSKE I FN 

KATLEVILRPKIGIiPGVAEKSRERLIRNTCEAWLGTLHPRTSI 

TVVLfQVVSDAGSLJ^CCIJ'JAACMALVnAGVP 

S DGTLVLD PTSKQEKE ARAVLT FALDSVERKLLMS STKGLYSDT 

ELQQCLAAAQAAS QHVFRFYRE S LQRRYS KS 


6329 


3 


2016 

- 


SSEVAAGGGTRSAMAEGSGEWTVSATGAANGLNNGAGGTSATT 
SNPLSRKLHKI LETRLDNDKEMLEAIiKAI*STFFVENSIiRTRRNI» 
RGDIERKS LAINEEFVS IFKEVKEELES ISBDVQAMSKCCQDMT 
SRIX3AAKEQTQDLIVKTTKLQSESQKLE I RAQVADAFLS KFQLT 
S D EMS LLRGTREGP I TEDFFKALGRVKQ I HNDVKVLLRTNQQTA 
GLE I MEQMALLQETAYER1»YRWAQSECRTLTQES CDVS PVIiTQA 
MEALQDRPVIjYKYTLDEFGTARRSTWRGF IDALTRGGPGGTPR 
P I EMHS HDPIiR YVGDMLAWLHQATASE KEHLEAIjLKHVTTQGVE 
ENT QEWGH I TEX5VCR PI> KVR I EQ VI VAE PG A VIxL YK I S NLLKF 
YHHTISG IVGNSATALLTTIEEMHIaDSKKI FFNSLSI*HASKLMD 
KVELPPPDLGPSSALNQTLMLLREVIiASHDSSVVPLDARQADFV 

QVLS CVIjD PLI>QMCTVS ASNIXSTADMATFMWSfc^ 
EFTDRRIiEMUJFQIEAHLDTLINEQASYVLTRVGLSY I YNTVQQ 
HKPEQGSIAl^PNIJ^SVTLKAAMVQFDRYLSAPDNLL I PQLNFL 
LS ATVKEQ I VKQSTELVCRAYGEVYAAVMNP INEYKDPEN I LHR 
SPQQVQTLLS 


6330


1151


333 


FFYYTFYENKTFSRKMVAEKETLSLNKCPDKMPKRTKLLAQQPL 
PVHQPHSLVSEGFTVKAMMKNSVVRGPPAAGAFKERPTKPTAFR 
KFYERGDFP I AUEHDSKGNKXAWKVEIEKLDYHHYIiPIiFFDGLC 
EMTFP YEFFARQGXHDMIiEHGGNKiriP VLPQLI I PIKNALNLRN 
RQVICVTLKVI>QHIlWSAEMVGKAI»VPYYRQILPVI^IFKN^^ 
NS GDG I D YSQQKRENI GDLIQETLEAFERYGGENAF INI KYWP 
TYESCLLN 


6331 


3 


495 


O^GQRVRTRGRRACASATPl»E<3CVDI^YPRTHAAIjIiKVAQMVTIj 
LIAFICVRSSLWTNYSAYSYF^IVVTICDIiIMIIAFYLVHIiFRFY 
RVIiTCISWPLS ELI^TYLIGTLIiLIASI VAAS KS YNQSGIiVAGA 


6332 


1 


878 


.VTESNKFDI>VSFIPLLRERIYSNNQYARQFIISWILV3 J ESVPDi 
NLLD YlxPE I LDGLFQ I LGDNGKE IRKMCEWLGE FLKE I KKNPS 
S VKFAEMAN I L VIHCQTTDDI*IQLTAMCWMREFI QLAGR VMI»P Y 
S S G I LTAVLP CLAYDDR KKS I KEVANVCNQS LM KLVT PEDDE LD 
EIiRPGORQAE PTPDDALP KQEGTAS GE WTPSLHLTS CRGPREPD 
VI GVAI*GPHI*SNQDYTMYVTHTI VAATQRSGSSGS PP FCRQDTG 
KLST^THSQLVKTGTGLEPRQAVSSSH 


6333 


3 


1467 


TRTPSEAEAGGESPQSCVSAAHSDWTAGKPVSKLAPLIPPRSAG 
QPLTFS PSGRQ PLRSLLVGMCSGSGRRRSSLSPTMRPGTGAERG 
GLMMGHPGMHYAPMGMHPMGQRANMPPVPHGMMPQMMPPMGGPP 
KGQMPGMMSS VMPGMMMSHMSQASMQ PALPPGVNSMDVAAGTAS 
SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid
residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid
residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)


GAKSMOTEHKSPDGKTYYYNTBTKQSTWEKPDDIjKTPAEQIJjS k 
CPW K3 Y1CSDSGKP YY YNSQTKESRNAKP KELEDIaEG YQNTXVAG 
SLITKSNLHAMI KAEES S KQEECi'Ti'STAPVPTTE I PTTMSTMA 
AAEAAAAWAAAAAAAAAAAAANAHASTSASNTVSGTVPVVPEP 
EVTSI VATWDNENTVT ISTEE QAQLTS T P A I QDQSVEVS SNTG 
EETSKQETVADFTPKKEEEESQPAJQCTYTWNTKEEAKQAFKEIiL 
KEKRVPSNASWEQAMKMI IND P R Y S ALAKL S EKKQAFNA Y KVQT 
EKK 


6334 


17


644 


GGNPSGRAAGFAAAAMPS S P LRVAWCS SNQNRS MEAHN 1I>SKR 
GFSVRS FGTGTHVKLPGPAPDKPNVYTDFKTTYDQMYNDLLRKDK 
ELYTQNGI IiHMLDRNKRI KPRPBRFQNCKDLFDL1XTCEERVYD 
QWED LNSREQETCQP VH WNVD I QDNHEEATLGAFL I CE LCQC 
IQHTEDMEKE IDELLQEFEEKSGRTFLHTVCFY 


6335


82 


529 


AARARPGVLCCRI.LGAALGDQSRVEMSYIPGQPVTAVVQRVEIH 
KIJlOGENI>ILGFSIGGGIDQDPSQNPFSEDKTDKGIYVTRVSEG 
GPAE IAGIjQIGDKI>X5WGWDMTMVTHDQARKRIjTKRS eewrl 
L-VTRQSLQKAVQQSMLS 


6336 


1003 


438 


HEPASKGRAEVGNMRLSVAAAISHGRVFRRMGLGPESRIHLLRN 
LXTGLVRHERIEAPWARVDEMRGYAEKLIDYGKJ^DTNERAMRhJ 
ADFVILTEKDLI PKLFX2VLAPRYKDQTGGYTRMLQ IPNRSlrDRAK ; 
MAV I E YKGN CL P P bPL P RRDS HLiTTjIjNQLiLQG XjRQDLRQ S QEAS 
NHSSHTAQTPG I 


6337 


76 


524 


EG 1 QML»SVQPDTKPKGCAGCNRKI KDRYULKAU5KYWHEDCLKC 
ACCDCRliGEVGS TLYTKANL I LCR RDYLRL FGVTGNCAACS KLI 
PAFEMVMRAKDimHUDCFACQLCNQRFCVGDKFFLKNNMILCQ 
TDYBEGLMKEGYAPQVR 


6338 


66 


1349 


APNSESGTQGPLPTPANLFWTRRANPDPTTSMSATDRMGPKAVP 
GIjRLAI*IiLI^lVGTPKSGVQG^EGl^FPEYDGVDRVl^T\^AK^ 
KNVFKKYEVXALLYHEPPEDDKASQRQFE^^ 
KGVGFGLVDSEKDAAVAKKLGLTEVDSMYVFKGDEVIEYDGEFS 
ADTIVEFLIiEJVLEDPVELIEGERELQAFENI EDE I KLIGYFKS K 
DSEH YKAFEE&AEEFHPYI PFFATFDS KGAKKLTUCLNB I DFYE 
AFMEEPVTI PDKPNSEEE IWFVEEHRRSTLRKLKPESMYETWE 
DDMDG IHI VAFAEEADPDGFE FLETLKAVAQDNTENPDLS 1 1 WI 
DPDDFPLLVPTWEKTFDIDI^APQIGVVNVTDADFcLWMEMDDEE 
DLPSAEELEDWLE0VLEGEINTEDDDDDDDD 


6339 


246 


1813 


NRCDRGGGGQAERQAGOGCRTQGAGPGFG FGHS FFSQGAMKAFH 
TFCWL.LVFGSVSEAKFDDFEDEED I VE YDDNDFAE FEDVMEDS 
VTESPQRVIITEDDEDETTVEIjEGQDENQEGDFEDADTQEGDTE 
SEPYDDEEFEGYEDKPDTSSSKNKDPITIVDVPAHLQNSWESYY 
LEILMVTGLIAY IMNYI IGKNKNS RIiAQAWFNTHREL»LESNFTL» 
VGDIXSTNKEATSTGKI^NQENEHIYNIiWCSGRVCCEGMLIQLRFIi 
KRQDLLNVLARMMRPVSDQVQ I KVTMNDEDMDTYVFAVGTRKAL 

DT KMVHFIjTHYADKI ES vhfs dqfsgp KI MQEEGQP LKLPDTKR 
TLLLTFNVPGSGNTYP KDMBALLPLMNMVI YS IDKAXKFRLNRE 
GKQKADKNPJu^VEENFLKliTHVQRQEAAQSRREEKKRAEKERIM 
NEEDPEKQRRLEEAALRREQKKLEKKQMKMXQIKVKAM 


6340


2 


583 


EACAHTLSCPAFARIjGRARRRP wmshrts STFRAERS fxsssss 
SS SSTSSSASRALPAQDPPMEKALSMFSDDFGSFMRPHSEPIiAF 
PARPGGAGNI KTLGDAYEFAVD VRDFS PED 1 I VTTSNNH I EVRA 
EKlAADGTV^INNFAHKCQLPEDVDPTSVTSAIiREDGSLTIRARR 

HPHTEHVQQTFRTE IK I 


6341


2 


£45 


KMAVLSAPGLRGFR I LGLRS S VG PAVQAR G VHQS VATDG P S S TQ 
PAIjPKARAVAPKPSSRGEYWAKIiDDIjVNWARRSSLWPKTFGIaA 
CCAVENMHMAAPRYDMDRFGAA/ERASPRQSDVMXVAGTLTNKMA 
PAIJIKVYDQMPBPRYWSMGSCANGGGYYHYSYSVVRGCDRIVP 
VDIYIPGCPPTAEALLYGHjQLQRXIKRERRbQrWYRR 


6342 


2 


1191 


DPRVRAMLATLARVAAIJ^TCLFSGRGGGRGLWT^^ 
KPLEGVKI IjDLTRVIiAGPFATMNLGDLGAEVI KVERPGAGDDTR 
TWG P PFVGTEST YYLS VNRNKKS I AVNI KDPKGVKI I KEIAAVC 
D VFVENYVPG KLSAMGLGYED IDE IAPHI I YCS I TGYGQTGP I S 

TAHGS IVPYQAFKTKDGYIWGAGNNQQFATVCKILDLPEL)IDN 

SKYKTNHLRVHNRKELIKIIiSEPJEEELTSKWLYLFEGS 

P INNMKNVFAEPQVI^HNGLVMEMEHPTVGKI S VPGPAVR YS KFK 

MSEARPPPIJXK5HTTHILKEVIiRYDDRAIGEI*LSAGVVDQHE*rH 


6343 


2 


936 


GTAMVSDEDELNLLVI WDANP I WWGKQALKESQFTLS KCIDAV 
MVLGNSHLFMNRSNK1AVXA5HIQESRFXYPGKNGRIjGDFFGCP 
GNPPEFNPSGS KDGKYELLTSANEVI VEE I KDLMTKSD I KGQHT 
ETLLAGSIiAKALCYIHRMITKEVKDNQEMKSRIIjVI kaaedsalq 
YmFMNVIFAAQKQNILIDACVLDSDSGIJL.QQACDITGGLYIiKV 
PQMPS LLQYUjWVFIiPDQDQRSOL 1 L PPPVHVDYRAACFCHRNI* 
IBIGYVCSVCXSIFCOTSPICTTCETAFKISLPPVLKAiOCKKIiK 
VSA 


6344 


2508 


147 


TMPTATLGNLRG YG MAS PGLAAP S I>TP PQLATPNLQQFFPQATR 
QSLLG PPPVGVPMNPSQFNLSGRNPQKQARTSS sttpnrxdsss 
QTMPVEDKSDPPEGSEEAAEPRMDTPEDQDLPPCPEDIAKEKRT 
PAPEPEPCEASELPAKRLRSSEEPTEKEPPGQLQVKAOPQARTTT 
VPKQTQTPDI.LPEAIiEAQVIiPRFQPRVLQVQAQVQSQTQPRIPS 
TDTQ vQPKLQ KQAQTQTS PEHtiVIiQQKQVQPQIjQQrAE PQKQ VQ 
PQVQPOAHSQGPRQVQLQQEAEPLKQVQPQVQPQAHSQPPRQVQ 
LOLQKQVQTQTYPQVHTQAQPSVQPQEHPPAQVSVQPPEQTHEQ 
PHTQPQVSLIJIPEQTPVVVHVCGLEMPPDAVEAGGGMEKTLPEP 
VGTQVSMEEIQNESACGLDVGECENRAREMPGVWGAGGSLKVTI 
U^SDSRAFS*rvPLTPVPRPSDSVSSTPAATSTPSKQALQFFCY 
ICKASCSSQQEFQDHMSEPQHQQRIjGE IQHMSQACLLSLLPVPR 
DVLETEDEE P PPRRWCNTCQL YYMGDL I QHRRTQDHKI AKQSLR 
PFCTVCmYFKTPRKFVEHVKSCOTKDKAKELKSliEKEIAGQDE 
DHFITVDAVGCFEGDEEEEEDDEDEEEIEVEEBLCKQVRSRDIS 
REEWKGSETYSPNTAYGVDFLVPVMGYICRICHKFYHSNSGAQL 
SHCKS LGHFENIjQKY KAAKN PS PTTRP VS RRCAINARNALTALF 
TSSGRPPSQPNTQDKTPSKVTARPSQPPLPRRSTRLKT 


6345 


2 


3403 


PRVRTKL I LLVNDKKRYEIIVGGGPKRIjGRDVEMEEM I EQLQE KV 
HELEKQNDTLKNRL I S AKOXJI*C/TQGYRQTPYNNVQSR INTGRRK 
AKEMAGLQECPRKG I KFQDADVAETPHPMFTKYGNSLLEEARGE 
IRNLENVIQSQRGQIEELEHIiAEILKTQLRRKENEIELSLLQIJPi 
EQQATDQRSNIRDNVEMIKLHKQLVEKSNALSAMEGKF IQLQEK 
QRTI^ISHDAIJ^ANGDEI^NMQIiKEQRLKCCSLEKQLHSMKFSER 
RI EELODRINDLEKEREKLKENYDKLYDSAFSAAHEEQWKIjKEQ 

qlkvq iaqletalksdltdkte i ldrlkterdqneklvqenret* 
qlqyleqkqqldblkkriklynqend inadelsealll i kaqke 
qkngdls flvkvds e inkdlersmreixjathaetvqelektrnm 
limqhkinkdyc^fateavtrkmenlg^dyelkveqyvhlldira 
arihkleaqlkdiaygtkqykfkpeimpddsvdefdetihlerg 
ekx.feihinkvtfssevlqasgdkepvtfctyafydfelo/itpv 

VRGLHPEYNFTSQ YLVHVNDLFIjQ YIQKNTI TLEVHQAYSTEYE 
TIAACXJLKFHEILEKSGRIFCTASIilGTKGDIPNFGTVEYWFRL 
RVPMDQAIRLYRERAKALGYITSITFKGPEHMQSLSQQAPKTAQL 
SSTDSTDGNLNELHITIRCCNHI^SRASHLQPHPYVVYTOTFDFA 
DHDTAI I PS SNDPQFDDHMYFPVPMNMDLDRYLKSBSLS FYVFD 
DSDTQENI YIGKVNVPLISLAHDRC ISGI FELTDHQKHPAGTIH 
VILKWKFAYLPPSGSITTEDIiGNFIRSEEpEVVQRIiPPASSVST 
L VLAPRPKPRQRIiTPVD KKVS FVD IMPHQS DVS QEGS VDEVKEN 
TE KMQQGKDD VSLDSEGQLAEQ S LAS S EDETE I TEDLEP EVE ED 
MSASDSDDCI I PG PIS KN I KQPSEKI R I EI IALS LNDSQ VTMDD 
TI QRLFVECRFYS LPAEETPVSLP KP KSGQWVYYNYSNV I YVDK 
ENNKAKRDILKAI LQKQEMPNRSLRFTWSDPPEDEQDLECEDI 
GVAHVDLADMFQEGRDL I EQNIDVFDARADGEG I GKLRVTVEAL 
HALQSVYKQYRDDLEA 


6346 


2921 


533 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPPSALTPS I W PQE IL 
AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLLANSPLME 
DAPQRLRWQAHLEFTHNHDVGDLTWDKI AVSLPRS E KLRSLVLA 
GIPHGMRPQLWMRLSGALQKKRNSELSYRE IVKNSSNDET I AAK 
Q I EKDLLRTMPSNACFASMGS IGVPRLRRVLRALAWLYP E IGYC 
QGTGMVAACLLLFIiEEEDAFWiMMSAI I EDLLPAS YFSTTLLGVQ 
TDQRVLRHLIVQ YL PRLDKLLQEHD I ELSLITLHWFLTAFAS W 
DIKLLLRI WDLFFYEGSRVLFQLTLGMLHLKEEELIQSENSAS 1 
FNTI^DIPSQMEDAEIJUI^VAI^RLAGSLTDVAVETQRRKHLAYL 
IADQGQT.I AGTLTNLSQVVRRRTQRRKSriTALLFGEDDLEAL 
KAKNI KQTELVADLREAILRVARHFQCTDPKNCSVVSRQLPGLIj 
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHENYVACSRSH 
RRRAKALLDFERHDDDELGFRKNDIITIVSQKDEHCWVGELNGL
RGWFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLVRGTLCPALK

AL FEHGLKKPSLLGGACHPWLF I EEAAGREVERDFAS VYSRLVL 
CKTFRIiDEDGKVLTPEELLYRAVQSWVTHDAVHAQMDVKLRSL 
ICVGLNEQVLHL WLEVLCS S L PTVEKW YQPWS FLRS PGWVQIKC 
EI^VLCCFAFSLSQDWELPAKREAQQPLKEGVRDMLVKHHLFSW 

DVDG 


6347 


2921 


533 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSALTPS IWPQE I L 
AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLLANSPLME 
DAPQRLRWQAHliE FTHNHD VGDLTWDKIAVS LPRS EKLRS LVLA 
G I PHGMRPQLW MRLSGALQKKRNS ELS YRE IVKNSSNDET I AAK 
QIEKDLLRTMPSNACFASMGS IGVPRLRRVLRALAWLYPEIGYC 
QGTGMVAACLLLFLEEEDAFWMMSAI I EDLLPAS YFSTTLLGVQ 

DIKLLLRIWDLFFYEGSRVLFQLTLGMLHLKEEELIQSENSASI 
FNTLSD I PSQMEDAE LLLGVAMRLAGSLTDVAVETQRRKHIAYL 
IADQ^QLLGAGT^TNLSQVVRRRTQRRKSTITA 
KAKNI KQTELVADLREAILRVARHFQCTDPKNCSVVSRQLPGLL 
PNTALTPPTPLVGLYSLWQELTPDYSl^SHQRDHEirrVACSRSH 
RRRAKALLDFERHDDDELGFRKNDr ITI VSQKDEHCWVGEIjNGL 
RGWFPAKFVEVLDERSKEYS IAGDDSVTEGVTDLVRGTLCPALK 
ALFEHGLKKP S LLGG ACH PWL F I E EAAGREVERD FAS VYS RL VL 
CKTFRIjDEDGKVLTPEELLYRAVQS VNVTHDAVHAQMDVKIiRS L 
ICVGIiNEQVLHLWLEVIiCSSLPTVEKWYQPWSFIiRS PGWVQIKC 
ELRVLCCFAFS LS QDWEL PAKREAQQPLKEGVRDMLVKHHIjFS W j 
DVDG 


6348 


3 


3679 


AGAEKCFVTIJ^CFIAKQQNKYKYEECKDLIKSMLRNELQFKBE 
KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN 
EHLQALLTPDE PDKSQGQDLQEQLAEGCRLAQHL VQKLS PENDN 
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSIiEECAITCS 
NSHGPCDSNQPHKNIKITFBEDEVNSTLVVDRESSHDECQDALN 
ILPVPGPTSSATNVSMVVSAGPLSGEKAAINILEINEKLRPOLA 
EKKQQFPJiLKEKCFLTQI^CFLAJNQQNKYKYEECKDLIKFMIiRN 
ERQ FKE EKIiAEQLKQAEELRQ YKVL VHSQERELTQLREKLREGR 
DASRSLNEHLQAI*LTPDEPDKSQGQDLQEQLAEG CRLAQHLVQK 
I^PENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE 
ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 
EDAVHI I PENESDDEEEEEKGPVSPRNLQES EEEEVPQES WDEG 
YSTl^IPPEhnJ^SYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK 
KEDHEATGPRLSREIJJJEKGPEVLQDSIJJRCYSTPSGCLELiTDS 
CQP YRSAFYVLEQQRVGLAVNMDE I EKYQEVEEDQDPS CPRLSR 
ELLDEKEPEVLQDSLGRCYSTPSGYIiEIiPDLGQPYSSAVYSLEE 
QYLGLALDVDR I KKDQEEEEDGGT PCPRI^RELLEVVE P E VLQD 
S LDRC YSTPS S CLEQPDS CQPYGSS FYALEE KHVGFS LDVGE I E 
KKGKGICKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG 
PBVLQDSLDRCYSTPSGCLELTDSCQP YRSAFYI LEQQRVGLAV 
DMDEI EKYQEVEEDQDPSCPPJL^GELLDEKEPEVLQESLDRCYS 
TPSGCLELTDS CQPYRSAFYILEQQRVGLAVDMDEI EKYQEVEE 
DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ
PYSSAVYSl^EQYLGLALDVDRIKJOQEEEEDQGPPCTRLSREL 
I1EWEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYAI.EEKH 
VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP 
RliNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVF^T 
SFEEEHISFALYVDNRFFTLTVTSIJILVFQMGVIFPQ 


6349 


3 


3679 


AGAE KC F VTLLACFLAKQQNKYKYE ECKDL I KSMLRNELQFKEE 
KLAEQLKQAEEI^QYTWLVHSQERELTQI^EKLREGRDASRSLN 
EI ILQALLTPDEPDKSC<5QDLQEQLAEGCRLAQHLVQKLS PENDN 
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS 
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDA1.N 
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA 
EKKQQ FRNLKBKCFLTQIJICFLANQQNKYKYEECKDLI KFMLRN 
ERQFKEEKLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGR 
DASRSLNEHI^ALLTPDEPDKSOGQDIiQEQLAEGCRIAQHLVQK 
I^PENDNDDDBDVQVEVAEKVQKSSAPRBMPKAEEKEVPEDSLE 
ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTblGSSSHVEW 
EDAVHI I PENESDDEEEEEKGPVS PRNLQES EEE E VPQES WDEG 
YSTLSIPPEMIiASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK 
KEDHEATGPRLSREIiliDEKGPEVIjQDSIiDRCYS TPSGCLELTDS 
CQPYRSAFYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR 
ELLDEKEPEVIiQDS LGRCYS TPSGYLEL PDLGQP YS SAVYSLEE 
QYLGLALDVDR I K303QEEEEDCGPPCPRI^REI»LEVVEPEVLQD 
SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFSLDVGE I E 
KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLS RELLDEKG 
PEVIjQDS LDRCY S TPSGCLELTDS CQP YRSAFYI LEQQRVGLAV 
DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS 
TPSGCLELTDSCQP YRSAFYI LEQQRVGLAVDMDE I EKYQEVEE 
DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ 
PYSSAVYSLEEQYI^IJU^VDRIKKDQEEEEDQGPPCPRLSREL 
I^VVEPEVLQDSLDRCYSTPSSOjEQPDS^PYGSSFYALEEKH 
VGFSLDVGE I EKKG KG KKRRGRRSKKERRRGRKEGEEDQNP PCP 
RIiNSMI»MEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY 
S FEEEH I S FALYVDNRFFTLTVTSLHLVFQMGVI FPQ 


6350 


3 


3679 


AGAEKCFVTLLACFLAKQQNKYKYEECKDLIKSMLRNELQFKEE
KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN 
EHI*QAIiTPDEPDKSO^QDLQEQLAEGCRLAQHLVQKLSPENDN 
DDDEDVQVEVAEKVQKSS S PREMQKAEEKEVPEDSLEECAITCS 
NSHGPCDSNQPHKNI KITFEEDEVNSTLWDRESSHDECQDALN 
ILPVPGPTSSATNVSMVVSAGPLSGEKAAINILEINEKLRPQLA 
EKKQQ FRNLKEKCFLTQLACFLANQQNKYKYEECKDLI KFMLRN 
ERQ FKEEKI^QLKQAEELRQYKVLVHSQERELTQLREKLREGR 
DASRSI^EHLQAIjLTPDEPDKSQGQDIiQEQLAEGCRLAQHLVQK 
LS PENDNDDDEDVQVEVAEKVQKS SAPREMP KAEEKEVPEDSLE 






WO 01/53312 



PCT/US00/34263



ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 
EDAVHIIPENESDDEEEEEKGPVSPRNIiQESEEEEVPQESWDEG 
YSTLS I PPEMLASYKSYSSTFHSLEEQQVOIAVDIGRHRWDQVK 
KEDHEATG PRLSRELLDEKGPEVLQDSLDRCYSTPSGCLEI.TDS 
CQPYRSAFYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRiSR 

QYLGLALDVDRI KKDQEEEEDQGPPCPRIiSREKLEVVEPEVIiQD 
SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVG FS LDVGE IE 
KKGKGKKRRGRRS KKERRRGR KEGEEDQN P PCPRLSREI*LDEKG 
PEVLQDSLDRCYSTPSGCLELTDS<^PYRSAFYILEQQRVGI*AV 
DMDEIEKYQEVBEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS 
TPSGCLELTDS CQP YRS APY I LEQQRVGLAVDMDE I E KYQEVEE 
DQDPSCPRLSRJBLLDEKEPEVIKJDSI^RCYSTPSGYLBLPDIJSQ 

LE WEPE VLQDSLDRC YSTPS S C1»EQPDS CQPYGS S FYAI*EEKH 
VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP 
RLNSMLMEVEEPEVLQDSLDI C YSTPSM YFELPDS FQHYRS VFY 
SFEEEH IS FAIiYVDNRFFTLTVTSLHLVFQMGVIFPQ 


6351 


1291 


319 


REARRRTEiiSQIjGRMLWEVANGRSI>VWGAEAVQALRERIiGVGG 

K A V bnurnu irKyjri orCJ-ro Ltt'LtLiLtV^i rri-farVK 1 Uiftfillaflv A JL»Vi>APRP 

DSRHHSLALTSFKRQQEESFQEQSALAAEARETRROELLEKITE 
GQAAKKQKLEQASGASS S QEAGS S QAAKEDE TS DGQASGEQK EA 
GPSSSQAGPSNGVAPI,PRSAIjI>VQIATARPRPVKARPLDX7RVQS 
KDWPHAGRPAHELRYS IYRDLWERGFFTiSAAGKFGGDFLVYPGD 

DT.DPU1HVT anrUBOPnTTDT/MM tr?V / 1 1*C\70lf , PT T T rctin 

PDGKWYTSLQWASLQ 


6352 


235 


923 


WSBVJLSPCHAAKCKGLSMbRITMKTRAISIAADATEFVQGRSAP 
AMARSDVHDTVFYCIjSVYQVKISPTPQIiGAASSAE^ 
LMGN>1NPEGGVNHEKGMNRDGGMI PEGGGGNOEPROOPOP PPEE 
PAQAAMEGPQPENMQPRTRRTKFTLLQVEELESVFRHTQY PDVP 
TRRELAE^njGVTEDKVRVWFKNKRARCRRHQRF 
DDCVYIWD 


6353 


65 


672 


RFAGAGAI PEARARPPDVQAAEEEKEMDL.PDSASRVFCGRI bSM 
VNTDDVNAI IliAQKNMLDRFEKTNEMT »T »NFNNIjS S ARLQQMS ER 
FLHHTRTLVEMKRDLDS I FRRI RTLKGKLiARQHPEAFSH I PEAS 
FltEEEDEDP I PPSTTTT IATS SQSTGS CDTSPDTVSPSLS PGFE 
DLSHVQPGS PAINGRSQTDDEEMTGE 


6354 


965 


510 


P SLRPMEPTRD CPLFGGAFSAI LPMGAIDVSDLRPVPDNQEVFC 
HPVTDQS LI VELLE1X5AHTOGEAAARYHFEDVGGVQGARAVHVE 
S VQ PI*S LENLALRGRCQEA WVLSGKQQ I AKENQQVAKDVTLHQA 
LLRLPQYQTDLI#LTFNQPP 


6355


158 


1662 


RGSSAAFRGSGI^GAMIRRVI,PHGMGRGL.LTRRPGTRRGGFSIjD 
WDGKVSE I KKKI KS I LPGRS CDLLQDTS HLP PEHSDVVI VGGGV 
LGIiSVAYITLKKLES RRGAIRVTiWERDHTYS QAS TGL.S VGG I CQ 
QFSLPENI QLSLFSAS FLRN INEYLAWDAPPLDLRFNPSGYLL 
IASEKDAAAMESNVKVQRQEGAKVSLMS PDQLRNKFPWINTEGV 
ALASYGMEDEGWF1>PWCLIX^I*RRKVQSIjGVLiFCQGEVTRFV s 
SQRMLTTODKAVVLKRIHBVHVKMDRSLEYQPVECAIVINAAGA 
WSAQ 17UUAGVGEGPPGTLQGTKLPVEPRKRYVYVWHCPQGPGL 
ETPLVADTSGAYFRREGLGSNYIA3GRS PTEQEEPDPANLEVDHD 
FFQDKVWPHIjAIiRVPAFETIJCVQSAWAG YYDYNT FDQNGWG PH 
PLWNM Y FATGFSGHGXjQQAPGI GRAVAEMVLKGR FQT IDLS PF 
LFTRFYLGEKIQENNI I 


6356 


354 


633 


TGLTSSCLPLQVMMTKRTKDMGKFSSVTVSTIDEEEEEIEAREV 
ADS YAQNAKV I EKQLE RKGMS KRRLQEIiAET*K AKKAKMKGTL.ID 
NQFK 


6357 


2 


915 


GLLRNMALLVRVlxRNQTS ISQWVP VCSRL I PVSPTQGQGDRALS 
RTS QWPQMS QSQACGGSEQIPG I DIQI/NRKYHTTRKLSTTKDS P 

TDFEEFFX,RCQMP0TFNSWFLITTiLHWMCLTOMKQEGl^GK^ 
CRI IVHFMW EDVQQRGRVMGVN P Y I LKKNM ILMTNHF YAAILG Y 
DEG ILSDDHGIiAAALWRTFPlTOKCEDPRKLELLVEYVRKQIQYI. 
DSMNGEDLLLTGEVSWRPLVEKNPQSILKPHSPTYNDEGI* 


6358 


2009 


1040 


AS DALHSLS AP VLRLS S RSAARPATMTEQAI S FAKD F LAGG IAA 
AI SKTAVAP IER VKLLLQVQHAS KQ I AADKQYKG XVDCI VRI P K 
EQGVLS FWRGNLANVIRYFPTQA1»NFAFKDKYKQ1 FI.GGVDKHT 

^PUPVPaf MT Re/V* Tvr^7\*T*OT /~ , XT\7Vr>T TvOti dtdt 7\ t\ tyi r /*• V C* 1 *M' c 

\J r W K X r AlwfljAduiaA/UiaAXoiA. c V ZlruUlfiUiTKLkAiUjVvaK^bxJa 
RE FRGLGDdiVKI TKSDG I RGLYQG FSVSVQG 1 1 IYRAAYFGVY 
DTAKG MLPD PKNTH I WS WMIAQT VTAVAG WS Y PFDTVR RRMM 

GAFVLVLYDELKKVX 


6359 


98 


1086 


VCRQEEEKMKEDCliPSSHVPI SDSKSI QKSELLGIxbKTYNCYHE 
GKS FQLRHREEEGTLI 1 EGLLN1 AWGbRRPIRbQMQDDREOVHlj 
PSTSWMPRRPSCPLiKBPS PQNGNITAQGPS IQPVHKAESSTDSS 
GPtiEEAEEAPQLMRTKSDASCMSQRRPKCRAPGEAQR I RRMRFS 
INGHFYNHKTS VFTPAYG S VT1WR VNSTMTTLQVLTLZiIjNKFRV 
EDGPSEFALYIVHESGERTKLKDCEYPL1SRILHGPCEKIARIF 
l^EADLGVEWHEVAQYIKFEMPVXDSFVEKLKEEEEREIIKLT 
MKFQALRLTMLQRIiEQIiVEAK 


6360 


1 


345 


GTRGAVPSTLEEVVXiPPRSCRVFWIHSGTTMSKVS FKITL.TSDP i 
RLP YKVLSVPESTPFTAVI»KFAAEE FKVPAATS AI ITNDG IGIN 
PAQTAGNVFLKHGSELRI I PRDRVGSC 


6361 


615 


158 


RPGLGQLQHCALAPQAGNRRCRFHGRLHALTRSTHRGKPMS IMQ 
r KUl JbrJ 1 rLrUDoPVAVrljoAi''lAVAo X Jbo VliiiNLAjVisiijl WAt- 
APGRWRRQITSQEFCHFIQGRCTFTPDIX5ETLHIQAGEALMLPA 
NSTGIWDIQBTVRKTYVLIL 


6362 


350 


1576 


TTMDGSHSAALKIjQQIiP PTS S SS AVS EAS FS YKENLIGALLA I F 
GHLWS IAI^^KYCTIRIjAGSKDPRAYFKTKTWWI^LFLMLI^ 
ELGVFASYAFAPLSLIVPLSAVSVIASAIIGIIFIKEKWKPKDF 
LRRYVLS FVGCGLAWGTYIxLVTFAPNSHEKMTG envtrhl vs w 
PFLLYMLVEI ILFCLLIiYFYXEKNANNI WILLLVALU3SMTW 
TVKAVAGMLVLS I QGNLQLD YP I FYVM FVCMVATAVYQAAFLSQ 

2\.<;OMVn Q T .TJX ^ VfTVT T •*? TT T A TT Af? A T F*VT .T1PT CJFDVLtH T CM F 

ALGCL17*FTjGVFLITllNRiaC?IPFEPYISMl^ 

TVQ PELKASFS YGAIiENNDN I S EI YAPATLPVMQEEHGS RS ASG 

VPYRVLEHTKKB 


6363 


21 


1201 


RRTRI*GSSFPRRRDSSAMESYDVIANQPVVIDNGSGVIKAGFAG 
lXJIPKYCFPNYVGRPKHVRVMAGALEGDIFIGPKAEEHRGI^I^S I 
R YPMEHG XVKD WNDMERI WQYVYS KT^QLG/TFSEEHPVtiLTE APL 
NPRKNRERAAEVFFETF1WPALFISMQAVLSLYATGRTTGVVLD 
SGDGVTHAVPI YEGFAMPHS I MRIDIAGRDVSRFI*RI»YLRJCEGY 
DFHSSSEFETVKAI KERACYI*S INPQKDETLETEKAQYYI.PDGS 
TIEIGPSR FRAPELLFRPDLI GEESEG IHEVL VFAI QKSDMDLR 
RT^FSNIVLSGGSTLFTCGFGDRLLSEVKKLAPKDVKI RISAPQE 
RliYSTWIGGSILASLDTFKJO^WSKKEYEEDGARS IHRKTF 


6364 


21 


1201 


RRTRLGSS FPRRRDS SAME S YDVI ANQ P WI CttJGSG VI KAGFAG 
DQ I P KYCF PNYVGRP KHVRVMAG ALEGD I F I G P KAE EHRGUL.S I 
RYPMEHGIVKDWNDMERI WQYVYS KDQLQTFSEEHPVLLTEAPL 
NPRKNRERAAEVFFETFNVPALFISMQAVLS LYATGRTTGVVIaD 
SGDGVTHAVPI YEG FAMPHS IMRID IAGRDVSRFLRLYLRKEG Y 
DFHSSSEFEIVKAI KERACYLS INPQKDETI*ETE KAQYYLPDGS 
TIEIGPSRFRAPEIXFRPDLIGEESEGIHEVLVFAIQKSDMDLR 
RTI>FSN I VIiSGGSTL FKGFGDRUUS EVKKLAP KDVKI R I SAPQE 
RL YSTW IGGS IIASLDTFKKMWVS KKEYEEDGARSIHRKTF 


6365 


234 


1939 


KHKSRASCAARAQAFGPSREREVHSRFRSGIJUUjGESNSGCCTM 
ASMGTLAFDE YGRPFL 1 1 KBQDPJCSRIJ^LEAIJCSHIKAAKAVA 
NTMRTS DG PNG LDKKMVD KDGDVTVTNDGAT I LS MMDVDHQ IAK 
LMVELS KSQDDE I GIXJTTGVVVLAGALLiEEAEQLiIiDRG IHP I RI 
ADGYEQAARVAI EHLDKI 5DSVLVDI KDTEPLIQTAKTTLGSKV 
VKSCHRQMAE IAVNAVLTVADMERRDVDFELI KVEGKVGGRLED 
TKLIKGVTVDKDFSHPQMPKKVEDAK1AILTCPFEPPKPKTKHK 
LDVTSVEDYKALQKYEKF.KFEEMIQQIKETGANXAICQWGFDDE 
ANHLXiLQNNL P AVRW VGG P E I El* IAI ATGGRI V P RFS ELT AE KL» 
GFAGI*VQE I S FGTTKDKMLVIEQCKNS RAVTI F I RGGNKMI IEE 
AKRS LHDALCVIRNLIRDNR WYGGGAAEIS CALAVSQEADKCP 
TLEQYAMRAFADALEVI P MAIDENS G MNP I QTMTEVRARQVKEM 
NPALG I DCLH KGTNDMKQQHV IETL IG KKQQ I SLATQMVRM ILK 
IDD1RKPGESEE 


6366 


257 


1898 


GNKEGAHSSTFWVL.LSI FLGAVAMIiCKEQGITVl^LNAVFDILV 
IGKFNVliE I VQ KVl^KDKSIJE^IIlGMlJiNGGLLFRMTLLTS GGAG 
MLYVRWR.IMGTG PPAFTEVDNPAS FADSMLVRAVNYNYYYSLNA 
WLIJ^CPWWLCFDWSMGCIPLIKSISDWR\^AIAALWFCLIGLIC 

QALCSEDGHKRR I LiTLGLG F LVX P FL PASNLFFR VG FWAERVXi 
YLPSVGYC^LTFGFGAIjSKHTKKKKIjIAAVVIiGILFINTLRCV 
IJfcSGEWRSEEQIiFRSALSVCPIjNAKVHYNIGKNLA^ 
RYYREAVRLNP KY VHAMNNLGN I LKERNELQEAEELLS LAVQIQ 
PDFAAAWMNLG rVQNSLKRFEAAEQS YRTAI KHRRKYPDCYYNL 
GRLYADLNRHVDAI^AWRNATVLKPEHS LAWNNM I ILLDNTGNL 
AQAEAVGREALEL I PNDHSIJ4FSLANVLGKSQKYKESEALFLKA 
I BCANPNAAS YHGN1AVL YHRWGHLDIiAKKHYEI SI*QIjDPTASGT 
KENYGLLRRKLELMQKKAV 


6367 


287 


1934 


S IGFP VML.VLSI LLYTCEMFQDSVAFEDYAVSFTQEEWAbLDPS ~ 

QKNIiYTU5VMQETFKNLTS VGKTWKVQN I EDEYKNPRRNI»55IiMR E 

KL»C££ KES HHCG ES FNQIADDMLNRKTLPG I TPCESS VCGEVGT 

GHSSLNTHIRADTGHKSSBYQEYGENPYRNKECKKAFSYLDSFQ 

SHDKACTKEKP YDGKECTETF I SHSCI QRHRVMHS GDGP YKCKF 

CGKAPYFLNIjCIiIHEJIIHTGVKPYKCKQCGXAFTRSTTLPVHER 

raTGTVNADECKECGNAFSFPSEIPJ^KRSHTGEKPYECKQCGKV 

FISFSSIQYHKMTHTGEKPYECKQCX3KAFRCGSHLQKHGRTHTG 

EKPYECRQCGKAFRCTSDLQRHEKTHTEDKPYGCKQCGKGFRCA 

SQLQIHERTHSGEKPHECiCECGKVFKYFSSLiaHERTHTGEKPH 

ECKQCGKAFRY FS S LHIHERTHTGDKP YE CKVCG KAFTCSSS IR 
YHERTHTGEKPYECKHCGKAF ISNYXRYHERTHTGEKPYQCKQC 

GKAFIRASSCREHERTHTINR 


6368 


1 


327 


RPVPAKLNPRSWPRTAGAIiPLRP PPLTMAV FHDEVE I EDFQYDE 
DSETYFYPCPCGDNFS 1 1 KEDLENGEU VATCPSCSI*! IKVIiDK 
DQFVCGETVPAPSANKELVKC 


6369 


1 


1745 


AGCCRDTRF PTPRG PGS LCHN F CRSAACT VYRT IHG S PREDTGT 
PRSREMMFQDSVAFE DVAVS FTQEEWALLDPSQKNLYRDVMQET 
PKNLTSVGKTWKVQNIEDEYKNPRRNLSLMREKLCESKESHHCX3 
ESFNQIADDMLNRKTLPGITPCESSVCGEVGTGHSSLNTH1RAI) 
TGHKSSEYQEYGENPYRNKECKKAFSYLDSFQSHDKACTKEKPY 
DGKECTBTF I SHSCI QRHRVMHSGDGP YKCKFCGKAFYFLNLCIi 
IHER IHTGVKP YKCKQCGKAFTRS TTLPVHERTHTGVHADECKE 
CGNAFSFPSEIRRHKRSHTGEKPYECKQCGKVFISFSSIQYHKM 
THTGEKPYECKQCX3KAFR0GSHLQKHGRTHTGEKPYECRQCGKA 
FRCTSDLQRHEKTHTEDKPYGCKQCGKGFRCASQLQIHERTHSG 
EKPHECiCECGKVFKYFSSLRIHERTHTGEKPHECKQCGKAFRYF 
SSI^IHBRTHTGDKPYECKVCGKAFTCSSSIRYHERTHTGEKPY 
HERTHTINR 


6370 


1711 


329 


FVLS EQRLRTERTW PRS PGliGRGAAAAGARTAGAGI^LRLLLGCG 

IAVSPRSLHSELMCPICIJDI^KrnTTI^KECIiH^ 
SGinCECPTCRKKLVSKRSLRPDPNFDAIiISKIYPSREEYEAHQD 
RV1»I RLS RLHNQQALS SSI EEGLRMQAMHRAQR VRR P I PGSDQT 
TTMSGGEGEPGEGEGDGEDVSSDSAPDSAPG PAPKRPRGGGAGG 
S S VGTGGGGTGG VGGGAG S EDS GDRGGT LGGGTLGP PSP PGAPS 
PPEPGGEIELVFRPHPLLVEKGEYCX2TRYV 

! LALR IALERRQQQEAGEPGGPGGG AS DTGG PDG CGGEGGGAGGG 
DGPEEPAI>PSI*EGV S EKQ YT IY IAPGGGAFTTLNGSLTLJBliVNE 
KFWKVSRPLELCYAPTKDPK 


6371 


3 


268 


GVANMSTAMNFGTKSFQPRPPDKGSFPLDKLGECKSFKEKFMKC 
LHNNETFENALCRKESKEYLECRMERXI^ 

Yrr* v^^k m ^rp 

KSEAKK 


6372


2141 


625 


R VSAI ASEG KAE ER YKKLiEDIiLEKS FS I*VKM PS LQPWMCVMKH 
LPK\^EKKLKI»VMADKE1jYRACAVEVRRQ I WQDNQALFGDEVSP 
LLKQY I LEKES AL>FS TEIiSVIjHNFFS PSPKTRRQGEWQRliTRM 
VGKNVTCLYDMVXiQFLRTLFTjRTRITV^ 

EICTVDPCHKFTWC1jDACIP^RFVI3SKRARELC^FLIX5VKK^ 
QVIXSDLSMILC^PFAil^LAI^TVRHI^EXVGQETL^RDSPDIJj 
LLLRLIALGQGAWDMIDSQVFKE P KME VEL I TRFLPMLMS FLVD 
DYTFNVDQKLPAEEKAPVS YPNTLPES FTKFI»QEQRMACEVGI»Y i 
YVLHITKQRNKNALLRLLPGLVETFGDLAFGBI FLHLLTGNIiAL 
LADEFALEDFCS SLFDGFFI/TAS PRKENVHRHALRLLIHLHPRV 
APS KLEALQKALEPTGQSGEAVKEL YSQLGEKLEQLDHRKPS PA 
QAAETPALELPLPSVPAPAPL 


6373 


67 


711 


PSRAARAS PARLPAMVSWI ISRLWLI FGTLYPAYYS YKAVKS K 
DIKEYVKWMMYW I IFALFTTAETFTDI FLCMFPFYYELKXAFVA 
WLLS P YTKGSSLI*YRKFVHPTLS S KEKE IDDCLVQAKDR S YDAL 
VHFGKRGLNVAATAAVMAAS KGQGALSERLRSFSMQDLTT IRGD 


6374 


535 


2105 


HKLFCS YISTSEFPSSTRHHSCPTHTFCNYTSST I FLSSTRDHS 
CPTHTFCmTSSTIFLSSTRDHSCPTHTSCNYTSSTIFLSSTRD 
HSCPlin^CNYTSSTIFr^STRDHSCPTHTFCNYPRPIlRLSSC 
CPAEIX^TEGSNGKKB VLSGFQVVLEDTVLFPEGGGQPDDRGT IN 
DISVLRVTPJRGEQADHFTQTPLDPGSQVIjVRVDWERRFDHMQQH 
SGQHL I TAVADHLFKLKTTS WELGRFRS Al ELDTPSMTAEQVAA 

ieosvnekipjjpxpvnvrelslddpeveqvsgrglpddhagp IR 
WN I EGVDSNMCCGTHVSNI*SDLQVT kilgtekgkknrtnlifl 
SGNRVLKWMERSHGTEKALTALLKCGAEDHVEAVKKT.QNSTKIL 
QKNNLNLLREIAVHIAHSLRNSPDWG^WILHRKEGDSEFMNI I 
ANEIGSEETIiLFLWGDEKGGGLFLIjAGPPASVETIX5PRVAEVI> 

egkgagkkgrfqgkatkmsrrmeaqallqdyistqsake 


6375 


1 


1535 


aimaaatrpvrlpeagcegrercwnpsrsrshsgegglaawsrt 
cpgrprrpgqqvvrg ptmiivtayliafvgirlascxgiiels rcrak 
ppgracsnpsfijlfqldfyqvyfiialaadwi^qapyiiyklyqx^ 
flegqiai lwtoiiastviifglvassi»vdwlgrkns cvlfslt y 
slccltki^qdyfvlivgrajlgglstaiilfsafeawy ihehver 
hd fpaewi pat faraafwnhvlavvagvaaeavas w iglgpvap 

FVAAI PLLAI»AGA1JCjRNWGENYT>RQRAFSRTCAGGLRCLXiSDR 
RVLLLGTIQALFESVI FIFVFLWTPVUJPHGAPLGI I FSSFMAA 
SlJ^SLYRIATSKRYHlJQP^IHI^LAVI,IVVFSIiFMLTFSTSP 
GQESPVESFIAFI^IBIACGLYFPSMSFIiRRKVTPETEQAGVLN 
W FRVP1»HS LACliG L»L»VLiHDS D RKTGTRNMFS I CSA VM VMALLA V 
VGLFTVVRHDAELRVPS PTEEP YAPEL 


6376 


380 


1437 


ISSTDIDHYRFSFLWSKMPSKESWSGRKTNRAAVHKSKQEGRQ 
QDU^IAALGMKLGS PKSSVT I WQPLKLFAYSQI/TS I*VRRATT»KE 
NEQI PKYEKIHNFKVHTFRGPHWCBYCANFMWGL1AQGVKC1ADC 
GLNVHKQCS KMVPNDCKPDLKHVKKVY S CDLTTLVKAHTTKRPM 
WDMC I RE I BSRGLNSEGLYR VSGFS Db IEDVKMAFDRDGEKAD 
ISVNKYEDINXITGALKLYFRDLPIPLITYDAYPKFIESAKIMD 
PDEQLETLHEAIjICLLP PAHCETLRYIiMAHLKRVTIiHEKENIiMNA 
ENLGIVFGPTLMRS PELDAMAALNDIRYQRLWELLI KNEDILF 


6377 


2311 


1845 


S RI RRRS SRR PRE P PGPSRRRRRRRPDPRTMPSEKTFKQRRT FE 
QRVEDVRLI REQHPTKI PVI lERYKGEKQLPVLDKTKFLVPDHV 
NMSELIKII RRRLQLNANQ AF FLLVNGHS MVS VST PIS EVYES E 
KDEDGFLYMVYASQETFGMKLSV 


6378 


686 


191 


GAGPWEAFPDGIGRRSRRARI>PQYKRPPGRVGGGDSGRRNMAVA 
DLALIPDVDIDSDGVFKYVLIRVHSAPRSGAPAAESKBIVRGYK 
WAEYHADIYDKVSGDMQKQGCDCECLGGGR I SHQSQDKKIHVYG 
YSMAYGPAQHAISTEKI KAKYPDYEVTWANDGY 


6379 


35 


378 


ERAG S PS PSRAAIjRRCAPQRSQAPRWPDRAACRRS FQGSQGRAY 
LFNS WNVGCG PAEERVLLTGIiHAVADI YCENCKTTLG WKYEHA 
FESSQKYKEGKYI IEIiAHMI KDNGWD 


6380


1414 


462 


PAVQGQRGAGPPTGRGSGNMARFAIaTWRHGETRFNKEKl IQGQ 
GVDE PI>S ETG FKQAAAAG I FXNNVKFTHAFS SDLMRTKQTMHGI 
LERSKFCKDMTVKYDSRLRERKYGVVEGKALSEIjRAMAKAAREE 
CPVFT PPGG ETL.DQ VKMRG I D P FE FL.CQL I LKEADQ KEQ FS QG S 
PSNCIiETS LAE I FPLGKNHSSKVNSDSGIPGLAAS VLWSHGAY 
MRSIjFDYFLTPLKCSIjPATLSRSEIiMSVTPNTGMSLFIINFEEG 
REVKPTVQC I CMNLQDHLNGIiTENS LGLNLPS KSNHFE PI>KGVP 
LALFTSLLC 


6381 


1668 


218 


awraqgsrgfsgagwrprqaaamnfsevfklssllckfspdgk 

YIiAS CVQ YRI> WRD VNTLQ ILQLYTCLDQIQHIEWSADSLFILC 
AM YKRGLVQVWSLEQPEWHCKIDEGS AGLVASCWS PDGRH I IiNT 
TEFHLRITVWSLCTKS^SYIKYPKACUK5ITFTRIX;RYMALAER 
RDC^YVSIFVCSDWQIJjRHFDTOTQDLTGIF^APNGCVI^VWD 
TCIjB YKI IxLYS IjDGRIiIiSTYS AYEWS LG I KS VAWS PSSQ FLAVG 
SYDGKVRILNHVTWKMITEFGHPAAINDPKIVVYKEAEKSPQIjG 
UGChS FPPPRAGAGPLPSSESKYEIASVPVSLQTLKPVTDRANP 
KIGIGMLAFSPDSYFIATRNDNIPNAVWVWDIQKLRIjFAVLEQL 

spvrafqwdpqcpriaictggsrlylwspagcmsvqvpgegdfa 
vi^lcwhi^gdsmaij^kdhfclcfleteavvgtacrqlgght 


6382 


2 


1062 


feededrniicl iaypucgdhg ivdi vdnsdcepksk1.lrwttnk 
khhvletektpkdwvrqhrkeekmkshkleeefewlkksevlyy 
tvekkgn issqlkhynpwsmkchqqqlqrmkenaxhrnqykfil 
leniits ryevpcvldlkmgtrqhgddaseekaanq irkcqqsts 
avigwvcgmqvyqagsgqimfmnkyhgrklsvqgfkealfqff 

RKGRYLRRELLG P VIiKIQiTEIjKAVIiERQESYRFYS S SLLVI YDG 

kerpewldsdaedledlseesadesagayaykpigassvdvrm 
i dfahttcrlygedtvvhegqdagyi fglqslidi vteiseesg 

E 


6383 


3159


1061 


S PAPGRPS PHGSQPAARAAAAPAMPSAKQRGSKGGHGAASPS ek 
GAHPSAARPIiAAPTPAAPACRS PS PGGAPAS FPGRAPRS liASQP 
AARAAAAPAMPSAKQRGSKGGHGAASPSEKGAHPSGGADDVAKK 
PPPAPQQPPPPPAPHPOQHPQQHPQNQAHGKGGHRGGGGGGGKS 
SSSSSASAAAAAAAASSSASCSRRLGRALNFLFYLALVAAAAFS 
GW CVHHVL E E VQQ VRRSHQDFS RQREELGQG LQGVEQKVQS LQA 
TFGTFESILRSSQHKQDLTEKAVKQGESEVSRISEVLQKLQNEI 
LKDI>SDGIHVVKDARSRDFTSLENTVEERLTEXTKS INDNIAI F 
TEVQKRSQKEINDMKAKVASLEESEGNKQDLKAIiKEAVKEIQTS 
AKSRE WDMEALRS TX»QTMESD I YTEVRELVS LKQEQQAFKEAAD 
TERLAI^AIjTEKIjIxRSEESVSRIjPEEIRRIjEEEIjRQLKSDSHGP 
KEDGGFRHSEAFEALQQKSQGLDSRIiQHVEDGVLSMQVASARQT 
ESLBSIXSKSQEHEQRIiAAI^RIiEGXjGSSEADQDGIiASTVRSl. 
GETQLVLYGDVEEIjKRS VGELP STVESLQKVQEQVHTIiIjSQDQA 
QAARIiP PQDFLDRIiS S LiDNLKAS VSQVEADLKMLRTAVDSLVAY 
S VKI ETNENNLES AKG LlLDDLRNDLDRL FV KVE KI HE KV 


6384 


738 


1904 


IWEVPVCLTH1»LHLQQANQPL>PPPSSSXNEEDADEANRAIGEKR 
AAPDSGKKPKTPKTKQQKDPNEPQKPVSAYAI*FPRDTQAAIKGQ 
NPNATFGEVSQIVASMWDSLGEEQKQVYKRKTEAAKKEYIjKALA 
AYRASLVS KAAAESAEAQTIRS VQQTIiAS TNI/TS S LLLNTPLSQ 
HGTVSAS PQTLxQQSIi PRSIAPKPLTMRIi PMNQ IVTS VT I AANMP 
SN I GAP L I SSMGTTMVG SAPSTQVSPSVQTQQHQMQIjQQQQQQQ 
QQQMQQMQQQQIjQQHQMHQQ I QQQMQQQHFQHHMQQHIiQQQQQH 
LQQQINQCMUXXiLQQRIiQLQQI'QHMQHQSQPS PRQHS PVASQI 
TS P I PAIGS PQPASQQHQSQIQSQTQTQVLSQVSI F 


6385 


2 


1584 


PRVRAADVAAGAQAWSAGMAKSNGENGPRAPAAGE3LSGTRES 
IAOGPDAATTDEIiSSI^SDSEANGFAERRIDKFGFIVGSQGAEG 
AI*EE\^LEVLRQRESKWI*DMIJnWDKWMAKKHKKIR^ 
PSLRGRAWQ YIiSGGKVKXrQQNPG KFDEIjDMS PGDP KWIiDVI ERD 
LHRQF PFHEM FVSRGGHGQQDLFRVIiKAYTL YRPEEG YCQAQAP 
lAAVLI^HMPAEQAFWCLVQICEKYIiPGYYSEKLEAIQLDGEIL 
FSIiI*QKVS P VAHKHliSRQKI DPLL YMTEWFMCAFS RTL PWS S VL 
RWDMFFCEGVKIIFRVGLVIJaKHAIiGSPEKVKACGGQYETIER 
LRSLS PKIMQEAFLVQEVVELPVTERQIEREHIjIQIiRRWQETRG 
EI^CRSPPRLHGtfU^AILDAEPGPRPAIiQPSPSIRIjPLDAPLPGS 
KAKPKPPKQAQKEQRKQMKGRGQLEKPPAPNQAMWAAAGDACP 
PQHVP P KDS AP KDS APQDLAPQVSAHHRSQES L>TS QES EDTYL 


6386 


819 


195 


TVCGS F YLG I MQRASRLKREIjHMIATEPPPGI TCWQDKDQMDDL 
RAQ I LGG ANTP YEKGVFKLEVI I PER YPFEP PQ I RFIjTP I YHPN 
IDSAGRICLDVLKLPPKGAWRPSLNIATVLTS IQLLMSEPNPDD 
PLMADISSEFKYNKPAFLKNARQWTEKHARQKQKAJDEEEMLDNIj 
PEAGDSRVHNS TQKRKASQLVGI EKKFHPDV 


6387 


1 


662 


PGPTHASADAWADAWAQPNMAMHNKAAPPQIPDTRRELAELVKR
KQELAETLANLERQIYAFEGSYLEDTQMYGNIIRGWDRYLTNQK
NSNSKNDRRNRKFKEAERLFSKSSVTSAAAVSALAGVQDQLIEK
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS
STSSGSHHSSHKKRKNKNRHSPSGMFDYDFEIDLKEjNKKPRADY
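
Each table entry pairs a predicted beginning and end nucleotide location with an amino acid segment. Because one amino acid residue is encoded by three nucleotides, the two locations bound the number of residues the segment can contain. The sketch below illustrates that arithmetic only. It assumes that both locations are inclusive positions and that a beginning location larger than the end location indicates the opposite orientation; the table states neither, and the function names are hypothetical.

```python
def codon_count(begin, end):
    """Number of complete three-nucleotide codons between two inclusive positions."""
    span = abs(end - begin) + 1  # assumption: both positions are inclusive
    return span // 3


def orientation(begin, end):
    """Assumed reading: begin <= end is forward, otherwise reverse."""
    return "forward" if begin <= end else "reverse"
```

For example, codon_count(1, 662) gives 220 complete codons for a span of 662 nucleotides.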


6388 


1 


662 


PGPTHAS ADAWADAWAQPNMAMHNKAAPPQI PDTRRELiAEliVKR 
KQELAETLiANIiERQ I YAFEGS YLBDTQM YGNT IRGWDR YIiTNQK 
NSNS KNDRRNRKF KEAERLFS KS S VTS AAAVSALAGVQDQLI EK 
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS 
STSSGSHHSSHKKRKNKNRHSPSGMFDYDFEIDLKLMKKPRADY 


6389 


1074 


497 


AEPGDRMAGHRLVLVLGDLHI PHRCNSLPAKFKKIiLVPGKIQHI 
LCTGNI>CTKES YDYLJCTIJVGDVH I VRGD 

QFKIGLIHGHQVI PWGDMASXADLQRQFDVDI I* I SGHTHKPEAF 
EHENKFY I NPGSATGAYNAliETN I IPS F VLMDI QAS TVVTY VYQ 
LIGDDVKVERIEYXKP 


6390 


158 


535 


GEERK£GRAPGKAFAPERNPAKMEKEETTREL»L,IjPNWQGSGSHG 

LTIAQRDDGVFVQEVTQNSPAARTGWKEGDQIVGATIYFDNLQ
sgevtqlumighhtvgi^kijlrkgdrffpsi/3qtwdp 


6391 


5386 


2897 


VRWNS KT E CYL»S I QTQEN F P ANLNE L VN C I V I SSD VTTy kKLKA 
MSLLGSRNQIJU^VLNPNPMDFCTKDLLTTTSERI IAYLRDFNE 
DQKKAI ETAYAMVKHS PS VAK I CLIHGP PGTGKS KT I VGLLYRI* 
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SEQ 
LTENQRKGHSDENSNAKIKQNRVLVCAPSNAAVDELMKKIILEF 
KEKCKDKKNPLGNCGDINLVRJbGPEKS INSEVLKFS LDSQVNHR 
MKKEXPSHVQAMHKRKEFIiDYQLDEX^RQRAXiCRGGREIQRQSI. 
DEN IS KVS KERQELAS KIKEVQGRPQKTQS 1 1 ILESHI ICCTLS 
TSGGLLLESAFRGQGGVPFSCVIVDEAGQSCEIETLTPLIHRCN 
KLILVGDP KQLPPTVI SMKAQEYGYDQSMMARFCRLLEENVEHN 
MISRLPILQLTVQYRMHPDI CLFPSNYV YNRNLiKTNRQTEAI RC 
S SDWPFQPYI.VFDVGDGSERRDNDSYINVQE I KLVMEI I KLI KD 
KRKXJVSFRNIGIITHYKAQKTMIQKDLDKEFDRKGPAEVDTVDA 
FQGRQKDCVIVTCVRANSIQGS IGFLASLQRLNVTITRAKYSLF 
I LGHLRTLMENQHWNQL I QDAQ KRGA 1 1 KTCDKNYRHDAVKI LK 
LKPVLQRSLTHPPTIAPEGSRPQGGLPSSKLDSGFAKTSVAASL 
YHTPSDSKE ITLTVTS KDPERP PVHDQLQDPRLLKRMG I EVKGG 
I Fit WD PQ PS S PQHPGATP PTGE PGFP WHQDIjS HVQQ PAA WAA 
LSSHKPPVRGEPPAASPEASTCQSXCDDPEEELCHRREARAFSE 
GEQEKQ3SETHHTRRNSRITOKRTLEQEDSSSKKRKLL 


6392 


972 


186 


GRTGVDIASSMAHRIiQIRLLTWDVKDTLLRLRHPLGEAYATKAR" 

AHGIiEVEPSALEQGFRQAYRAQSHSFPNYGLSHGLTSRQWWLDV 

VIXJTFHLAGVQDAQAVAPIAEQLYKDFSHPCTWQVLDGAEDTIjR 

ECRTRGLRLAVT SNFDRRLEG ILGGLGIiREHFDFVLTg EAAGWP 

KPDPRIFQEALRLAHMEPVVAAHVGDNYI*CDYQGPRAVGMHSFIj 

WGPQALDPVVRDSVPKEHILPSLAHLIiPALDCLEGSTPGIi 


6393 


2017 


730 


TGGS KMAAVATCGS VAASTGSAVAT AS KSNVTS FQRRG P RAS VT 
NDS GP RLVS IAGTR PS VRNGQLLVS TGL PALDQLLGGGLAVGTV 
LLIEEDJCYNIYSPI/LFKYFliAEGIVNGHTLLVASAKEDPANILQ 
ELPAPLLDD KCKKEFDEDVYNHKTPESNI KMKI AWRYQI*I»P KME 
IGPVSSSRFGHYYDASKRMPQELIEASNWHGFFLPEKISSTLKV 
EPCSLTPGYTKLLQFIQNIIYEEGFDGSNPQKKQRNILRIGIQN
I^SPLWGDDICCAENGGNSHSLTKFLYVIJlGIiLRTSIiSACIITM 
PTHLIQNKAI IARVTTLSDVWGLES FIGSERETNPI/YKDYHGI* 
IHIRQIPRI^OTLICDESDVKDI^KliiaiKLFTIERI^PPDI^D 
TVSRS SKMDLAESAKRLGPGCGMMAGGKKHLDF 


6394 


1418 


511


GAAAGG EGARRRPAAMATVMAATAAERAVliEEE FRWLLHDEVHA 
VLKQLQDI L KEASLRFTL PG SGTEG PAKQENF I LGS CGTD Q VKG 
VLTLO^DAIfSQADVNLKMPRNNQLIiH FAFREDKQWKLQQ I QDAR 
NHVSQAI YLI*TS RDQS YQFKTGAEVLKIi^AVMLQIiTRARlJRLT 
TPATLTLPE IAASGLTRMFAPAIjPSDLLVNVYINLNKIiCLTVYQ 
LHALQ PNS TKNFRPAGGAVLHS PGAM FE WGSQRLE VSHVHKVEC 
VIPWLNDALVYFTVSIjQLCQQLKDKI SVFSSYWS YRP F 


6395 


13 


658 


PSGRPTRPLCCAARRGAARHGGSVSG WPAGRTPTETSNPGS S VM 
ESVTFEDVAVE FI QEWALLDS ARRSLC KYRMLDQ CRTLASRGTp 
PCKPSCVSQLGQRAEPKATERGILRATGVAWESQLKPEELPSMQ 
DLLEEASS RDMQMGPGLFLRMOIA/PS I EERETPLTREDRPAIiQE 
PPWSLGCTGIiKAAMQIQRWI PVPTLGHRNPWVARDSGE 


6396 


1 


1221 


ANHjSSPSKRGQKGTtilGYSPEGTPLYNFMGDAFQHSSQSIPRF 
IKESLKQILEESDSRQIFYFIiCIjNIJiFTFVELFYGVLTNSIjGIiI 
SDG FHMLFDCS ALVMGIjFAALMSRWKATR I FS YG YGRI E ILSGF 
INGI^LIVIAFFATFMESVARLIDPPELDTHMLTPVSVGGIilVNL 
IGICAFSHAHSHAHGASQGSCHSSDHSHSHHMHGHSDHGHGHSH 
GSAGGGMNANMRG VTLHVLADTLGSIGV I VSTVL I EQ FGW FI AD 
PLCSLFIAlDIFLSVVFLIKDACQVLLLRLPPEYEKELHIAIiEK 
IQKIEGLI SYRDPHFWRHSAS IVAGTIHI QVTSDVLEQRI VQQV 
TGILKDAGVNNLTIQVEKEAYFQH^GLSTGFHDVIJVMTKQMES 
MKYCKDGT Y I M 


6397 


391 


122 


GAGGVGRF^AIRAPARMIEWCNDRI/jKKVRVKCNTDDTIGDLK 
KLIAAQTGTRWNKI VLKKWYT I FKDHVS LGDYE IHDG MNLEL, YY 
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Dwonnr.ir> <wo 
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PCTAJS00/34263 



Q 


6398 


353 


1306 


HKQMGPLIiTOCKXIbLPTTVPPATMRIWLIXSGLLPFLIJ^LSGIXJ 

RPTEGSEVAIKIDPBPAPGSFDDQYQGCSKQVMBKLTQGDYFTK 

DIFAQKNYFRMWQKAHIJIWI^OC^ 

SNVHSDFTRAMASVARTPO^YERSPllFKyiJrXTLTS 

S I M ENGXL CYEVHYRTKDVH FNAYTGATIR FG Q FL>STS li L» KEEA 

QEFGNQTLFTIFTCIXjAPVQyFSLKKEVLIPPYEIiFTCVINMSra 

PRGDWLQLRSTGNLSTYNCQLIiKASS KKCIPDPI AI ASLS FL»TS 

VIIFSKSRV 


6399 


75 


1245 


PNIiETYFGRRCEKDSMNFTPTHTPVCRKRTWSKRGVAVSGPTK 
RRGMADSLESTPI^SPEBRIjAKLHPSKELLEYYQKKMAECEAEN 
EDI^KJCIjELYKEACEGQHKIjECDLC^REEEIAELQKAI^DMQVC 
LFQEREHVLRLYSENDRIJIIRELEDKKKIQNIjIALVGTDAGEVT 

YFCKEPPHKVTILQKTIQAVGECEQSESSAFKADPKISKRRPSR

ERKES S EMYQPJDIQTLILQVEALQAQLGEQTKLSREQIEGIj I ED 
RR IKLEE I QVQHQRNQNKI KELTKNliKHTQELLY ESTKDFLQLR 
SENQNKEKSWMLEKBNLMSK I KQYRVQCKKKEDKIGKVLPVMHE 
SHHAQSEYI KVMSLCRNEWYFSGRVEGI PKNLQFVM 


6400 


2520 


1053 


KTMKCDEVVYEVQSAIIJiHNCGYAMKTGKFFPiNIiMERKDFETWli 
DNISVTFLSIiTDLQra^ETI^IlL,ISLSGAVQLRHLSNNLETLLKR 
DFLKLLPL.EX»S FYLLKWLDPQTLLTCCLVS KQWNKVISACTEVW 
0/TACKNI*GWQIDDSVQDAIjHWKKVY1»KAILiRMKQI*EDHEAFETS 
SLIGHS ARVYALYYKDGLLCTGSDDLSAKLWDVSTGQCVYGI QT 
HTCAAVKFDEQKLVTGSFDNTVACWEWS SGARTQHFRGHTGAVF 
SVDYNDELDILVSGSADFTVKVWALSAGTCLNTLTGHTEWVTKV 
VLQKCKVKS LLHS PGDYILLSADKYE I KI WP IGRE I NC KCIjKTT* 
SVSEDRSICLQPRLHFDGKYIVCSSALGLYQWDFASYDIIiRVIK 
TP E I ANliALLG FGD I FALLFDNRYLYIMDLRTESI>ISRWPIiPEY 
RKSKRGS S FLAGEASWLNGLDGHNDTGLVFATSMPDHS IHLVLW 
KEHG 


6401 


109 


766 


PGAAWSRPDIjRGCCTGPQPALRMLVL.PSPCPQPIAFSSVETMEG 
PPRRTCRSPEPGPSSSIGSPQASSPPRPNHYLCIDTQGVPYTVI, 
VDEESQRE PGASGAPGQKKCYS CPVCSRVFE YMS YLQRHS ITHS 
EVKPFECDI CGKAFKRASHLARHHS IHLAGGGRPHGCPLCPRRF 
RDAGEIAQHSRVHSGERPFQCPHCPRRFMEQNTLQKHTRWKHP 


6402


1196 


279 


TTSQCGGI RQSSAI PVASMEFAAI CLRNAIjLLIjPEEQQDPKQEN 
GAKNSNQLGGNTESSESSETCSS KSHDGDKFI PAPPSSPURKQE 
LENLKCS I IACSAYVALAICTNLMALNHADKLLQQPKLSGSLKF 
LGHL YAAEAX. I S LDR I S BAITHLNPENVTDVS LG ISSNEQDQGS 
DKGEWEAMESSGKRAPO^PSSVNSARTVNLFm^SAYCLRSEY 
DKARKCLHQAASM IHP KEVPPEAI I»LiAVYl»E LQNGNTQLAIjO; I 1 
KRNQLLPAVKTHSEVRKKPVFQPVHPIQPIQMPAFTTVQRK 


6403 


2 


1690 


RG IHTSVLQGNLQNQMYSHNWIMNIiNNLNLTQVQQRNLI TNLQ 
RS VDDTSQAIQRIKNDFQN1jCX2VFLQAKKDTDWLKEKVQS LQTL 
AANNSAIiAKANNDTIiEDMNS QLNS FTGQMENI 1 1 XSQANEQNIjK 
DLQDLHKDAENRTAIKFNQLEERFQLFETDIVNIISNISYTAHH
LRTLTSN LNEVRTTCTDT LTKH TDD LT S LNNTLAN I RDD S VS I*R 
MQQDIiMRSRLDTEVANIiSVIMEEMKLVDSKHGQLIKNFTILQGP 
PGPRGPRGDRGSQGPPGPTGNKGQKGEKGEPGPPGPAGERGPIG 
PAGPPGERGGKGS KGSQGPKGSRGS PGKPGPQGPSGDPGP PGPP 
GKEGLPGPQGPPGFQGLQGTVGEPGVPGPRGLPGLPGVPGMPGP 
KGPPGPPGPSGAWPLALQNEPTPAPEDNSCPPHl'WaiFTDKCYY 
FSVEKE I FEDAKLFCEDKS SHLVFINTREEQQW I KKQMVGRESH 
WIGLTDSERENE WKWLDGTS PD Y KNV7KAGQPDN WGHGHGPGEDC 
AGL I YAGQWND FQCEDVNNFI CEKDRETVLS SAIi 


6404


1012 


222 


AAAIiAMAAPMGLISVFSSSQELGAAIiAQLVAQRAACCIiAGARA 
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SEQ 
RFAIjGLSGGSbVSMLARELPAAVAPAGPASLARWTLGFCDERLV 
P FDHAEST YGLYRTHLLSRLP I PESQV I TI NPELPVEEAAED YA 
KKLRQAFQGDS I PVPDLLILGVGPDGHTCSLFPDHPKLQERBKI 
VAPI SDS P KP P PQRVTLTLPVLNAARTVI FVATGEGKAAVLKR I . 
LEDQEENPLPAALVQPHTGKIjCHFl^EAAARIjLTVPFEiaiS PL 


6405 


1 


1456 


AAI^RPTPRAPIjGREGTGSDSEMAASMFYGRLVAVATIiRNHRPR 
TAQRAAAQVLGSSGLFNNHGLQVG^QO^RNIjSLHEYMSMEXJjQE 
AG VS VP KG YVAKS PDEAYAIAKKLGSKDWIKAQVLAGGRGKGT 
FESGLKGGVKI VFS PEEAKAVSSQMIGKXLFTKQTGEKGRI CNQ 
VLVCERKYPRREYYFAITMERSFQGPVLIGSSHGGWIEDVAAE 
TPEAI I KEP IDIEEG I KKEQALQLAQKMGFPPNIVESAAJENMVK 
I^YSIjFLKTOATMIEINPMVEDSDGAVTjCMDAKINFDSNSAYROK 
KI FDLQDWTQEDERDKDAAKANLNY IGLDGN IGCLVNGAGLAMA 
TMDI IKLHGGTPANFLDVGGGATVHQVTEAFKblTSDKKVLAI L 
VNTF(5GTMRC!DVIAOGIVMAVKDIiEIICC PVVVRLQGTRVDDAKA 
L IADSGLKI LACDDIiDEAARMW KLSE I VTLAKQAHVDVKPQLP 
I 


6406 


1036 


167 


HPRQMRGEDTPEAP P YSSGR YDS I KTEVSGCPEDIiTVGRAPTAD 
DDDDDHDDHEDNEKMNDSEGMDPERLKAFN^ 

VP T S KOPKE KT OAI I E S CS ROFPE FOERARKR IRTYLKS CRRMK 
KNGMEMTRPTP PHLTSAMAENT LAAACE^ ETRJCAAKRMRLE I YQ 
SSQDEPI ALDXQHS RDSAAI THST YSLPAS S YS QDP VYANGGLN 
YSYRGYGALSSNLQPPASLQTGNHSNGESGEARALASRPAPSWV 
CRAALGSGMGRGKQRPVMERGCLTA 


6407 


492 


150 


VGLCIAVSQTVLAQLDAL1.VFPGQVAQLSCTIjS PQHVT I RDYGV 
SWYQQRAGSAPRYLLYYRSEEDHHRPADIPDRFSAAKDEAHNAC 
VLTIS PVQPEDDADYYCS VGYGFS P 


6408


1458 


903 


RGCITSSQAWRLFGGVTRGFNMRI EKCYFCSGPI YPGHGMMFVR 
NDCKVFRFCKS KCHKNFKKKRNPRKVRWTKAFRKAAGKELTVDN 
S FEFEKRRNEPI KYQRELWNKTI DAMKRVEE I KQKRQAKFIMNR 
bKKNKEtX3KVQDIKEVKQNIHLIRAPLAGKGKQLEEKMVQQLQE 
DVDMEDAP 


6409 


150 


446 


NTAliANLLRCFTCDRLCGGCTAPAP PAHQG I VLQP VM PS CDPGP 
GPACL PTKTFRS YLPRCHRT YS CVHCRAHLAKHDEL I S K S FQGS 
HGRAYIiFNSV 


6410 


85 


607 


RGGTAGCVACIjGCWGQSSSPKAAFPAGSACIjPADSCPCLIjFQAC 
AISGLFNCITIHPLNIAAGVWMIMNAFIIJjLCEAPFCCQFIEFA 
NTVAEKVDR1*RS WQKAV F YCGMAVVP I VT S LT LTTLLGNAI A FA 
VLYGLSALGKKGDAI S YARI QQQRQQADEEKLAETLBGEI* 


6411 


302 


772 


RLSIMASSLNEDPEGSRITYVKGDLFACPKTOSLAHCISEDCRM 
GAG IAVLFKKKFGGVQELLNQQKKSGEVAVLKRDGRYI YYXi ITK 
KRASHKPTYENliQKSLEAMKSHCLKNGVTDLSMPRIGCGIiDRLQ 
WENVSAMIEEVFEATD1KITVYTL | 


6412 


61 


1709 


RP VTS FS PLPGS CGGRI/GTRTMLGRS IjRE VSAAI*KQGQ I T PTKL. 
CQKCLSLI KKTKFI^AYITVSEEVALKQAEESEKRYKNGQSLGD 
LDG I P I AVKDN FSTSG I ETTCASNMLKG YIPP YNATVVQKliI*DQ 
GALI^GKTNLDEFAMGSGSTIXJVFGPVKNPWSYSKQYREKRKQN 
PHSEKEDSDWIjI TGGSSGGSAAAVSAFTCYAALGSDTGGSTRNP 
AAHCGLVGFKPS YGLVSRHGLIPLWSMDVPGI LTRCVDDAAI V 
LGALAGPDPRDS TTVHE P INKPFMLP S LADVS KI*C IG I P KE YLV 
PELSS E VQSLWS KAADLFES EGAKVI EVS LPHTS YS I VC YHVLC 
TS EVASNMARFDGLQ YGHRCD I D VS TEAM YAATRREG FND WRG 
RII^GNFFLLKENYEKYFVKAQKVRRIiIANDFWAFNSG^ 
TPTTLSEAVP YLEFIKEDNRTRSAQDDI FTQAVNMAGIjPAVS I P 
VALSNQGLP IGLQ F I GRAF CDQQLLTVAKWFE KQ VQFP VI QLQE 
LMDDCSAVLENEKIiASVSIiKQ 
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SEQ 


6413 


2 


885 


HEPRCAGMAASLWMGDI^PYMDEOTISRAFATMGETVMSVKI IR 
NRLIG 1 PAGYCFVEFADLATAE KCLHKI NGKPLPGATPAKRFKL 
NYATYGKQPDNS PE YS LFVGDLTPDVDDGML YE FFVKVYPSCRG 
G KWLDQTG VSKG YG FVKFTDELEQKRALTECQGAVGLGS KPVR 
hS VAI PKASRVKPVEYSQMYSYSYNQYYQQYQNYYAQWGYDQNT 
GSYSYSYPQYGYTQSTMOTYEEVGDDALEDPMPQIjDVTEANKEF 
MEQSEELYTXALMDCHT^QPLDTVSSEIPAMM 


6414 


1 


538 


RGGRAALLPWRRFPCCRPRPQPARPSSRATPGPRSPGMATSIGV
S FSVGDGVPEAEKNAGEPENTY I LRPVFQQR FRPSWKDCIHAV 
LKEELANAE YS PEEMPQLTKHLS EN I KDKLKEMG FDR YKMWQV 
VIGEQRGEGVFMASRCFWDADTDNY^HDVFMNDSLFC^VAAFGC 
FYY 


6415


2 


1168 


FVRQWQSSHRRACGIiGCEARAGGGEEPRGRASSVAGWVGAFRAP 
FIEAAVAGLGAGSGKKRRGWKMPVHSRGDKKETNHHDEMEVDYA 
ENEGSSSEDEDTESSSVSEDGDSSSMDDEDCERRRMECLDEMSN 
IjEKQFTDI>KDQLYKERI*SQVDAKI*QEVIAGKAPEYI*EPIjATI*QE 
NMQ I RTKVAG I YREL CX»ES VKNKYECE I Q AS RQHCES E KT J »T » YD 
TVQSEI^EKIRRI^EDRHSIDITSELWNDELQSRKKRKDPFWPD 
KKKPGWSGP Y I VYMLQDIjD I IiED WTTI RKAMATLGPHRVKTE P 
P VKLEKHIiHS ARS E EG RL YYDG EW Y I RGQT I CIDKKDECPTSAV 
ITTINHDEVWFKRPDGSKSKLYISQLQKGKYS IKHS 


6416 


410 


1519 


EI APADLE I PACAP VLLiS RATSS TMS VTGGKMAPS LTQE I LSH1» 
GLAS KTAAWGTLGTLRTFIJSrFS VDKDAQRLLRAITGQGVDRSAI 
VDVLTNRSREQRQL I SRNFQERTQQDLMKS LQAAI>SGNIjERIVM 
ALLQPTAQ FDAQEliRTALKASDS AVDVAI E I LAIRTP PQLQE CI* 
AVYKHNFQVEAVDGI TSETSG I LQDT.T J.AIAKGGRDSYSGIIDY 
NLAEQDVQAI/QRAEGPS RE ET WVP VFTQRNPEHLIRV FDQ YQRS 
TGQELEEAVQNRFHGDAQVALIiGLAS VIKNT PIjYFADK1*HG^I»Q 
ETEPNYQVL IRI LISRCETDLLS I RAEFRKKFGKSLYSS LQDAV 
KGDCQSALIiAIiCRAEDM 


6417 


1 


845 


RGESRVLWS ELEGEAGGAGGWAS SLNARMDNRFATAFVIACVLS 
LISTIYMAASIGTDFWYEYRSPVQENSSDLNKSIWDEFISDEAD 
EKTYNDALFRYNGTVGLWRRCITI PKNMHWYSPPERTESFDWT 
KCVSFTLTEQFMEKFVDPGNHNSGIDLLRTYLWRCQFLLPFVSL 
GLMCFGAIj I G1»CACI CRSI* YPTI ATG I LHLLAGLCTLGS VS CYV 
AG I ELLHQKLEIiPDNVSGEFGWS FCIiACVSAPLQFMASALFIWA 
AHTNRKEYTIiMKAYRVA 


6418 


2 


662 


TRTRPRRPPGI*GAAVGKAGARSTS TPAGAS PAAAYQADP PP PAH 
TPAPPPPPPCGGIACHGEPAKFYGYDNLQRQP I FTTQQEAELVQ 
YPDCKSSSGNIGEDPDHLNQSSSPSQMFPWMRPQAAPGRRRGRQ 
TYSRFQTLEIiE KEFLFNP YLTRKRRI EVSHALALTEROVKI WFQ 
NRRMKW KKENNKDKFPVSRQEVKDGETKKEAQE LEEDRAEGIiTN 


6419 


1 


973 


PGRPRVRNFDLNSKS ILQEFFCTRS IQI PANRSKTAMSKCPI FP 
MARSISTSGPIJ)KEDTGRQKI,ISTGSLPATLC<3ATDSI>GLEWHL, 

PSPD PVTVP YLS PL WWKELESUjENEGDHAI TVAD FVDHHP I V 
FWNI»VWYFRRLDLPSNLPGI* I LSSBHCNKYSKI PRHCMSEDS KY 
VT.IQMLVroNMKLHQDPGQPLYILWNAHTQKYPMVHLLQKSDNS F 
NQELLKSMVKS I KMNDVYG PMSQIIiETLNKCPHFKRQRSLYREI 
LFLS LVALGRE NIDI DAFDKE YKMA YDRLTP S QVKS THNCDR P P j 
STGVMECRKTFGEPYIi 


6420 


207 


1187 


RKMIDKNQTCGVGQDSVP YMICLIH I LEEWFGVEQLEDYLNFAN 
YIXWVFTPLILLIL.PYFTIFXDYLTIIFI.HIYKRKNVLKEAYSH 
NLWDGARKTVATIiWDGHAAVWHGYEVHGMEKIPEDGPALIIFYH i 
GAIPIDFYYFMAKIFIHKGRTCRVVADHFVFKI PGFSLLLDVFC 
ALHGPREKCVEILRSGHLIAISPGGVREALISDETYNIWGHRR 
GFAQVAIDAKVPI I PMFTQNI REGFRSLGGTRLFRWLYEKFRYP 
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SEQ 
FAPMYGG FP VKLRTYLGDP I P YDPQI TAEELAEKTKNAVQALI D 


6421 


1844 


362 


WALSLRRQPERWSNKL.LSPHPHSWLRS EFKMASSPAVLRASRL 
YQWSLKSSAQFLGS PQLRQVGQI IRVPARMAATLILEPAGRCCW 
DEPVRI AVRGIiAPEQPVTLRASLRDE KGALFQAHARYRADXLGE 
LDIiERAP ALGGS FAGLE PMGIil*W ALEPEKPLVRIjVKRD VRT P LA 
VET.RVIiDGHDPDPGRIiLCQTRHERYFLP PGVRREPVRVGRVRGT 
LF^PPEPGPFPGIVDMFGTGGGLLEYRASLLAGKGFAVMALAYY 
NYEDLPKTMETLHLEYFEEAMNYLLSHPEVKGPGVGLLGISKGG 
E LCLS MAS FL KG I TAAVVINGS VANVGGTLR YKGETLP PVG VNR 
NRI KVTKDG YAD IVDVLNS PLEGPDQKS FI P VBRAES TFTjFLVG 
QDDHNWKSEFYANEACKRLQAHGRRKPQI ICYPETGHYIEPPYF 
PIiCRAS LHALVGS P 1 1 WGGEPRAHAMAQVDAWKQLQTFFHKHLG 
GREGTIPSKV 


6422 


181 


2133 


EGENIjSWFQEFWGDIAKEFYWKTPCPGPFLRYNFDVTKGKIFIE 
WMKGATTNICYNVLDRNVHEKKLGDKVAFYW^ 
HQLI*VQVCQFSNVLRKQGIHKGDRVAlYMPMIPELVVAMIiACAR 
IGALHSIVFAGFSSESLCERII^SSCSIjIjITTDAFYRGEKIjVNL, 
KELADEALQKCQEKGFPVRCCI WKHLGRAELGMGDSTSQSPP I 
KRSCPDVQISWNQGIDLWWHELMQEAGDECEPEWCDAEDPIjFIIj 
YTSGSTGKPKGVVHTVGGYMLYVATTFKYVFDFHAEDVFWCTAD 
IGWITGHSYVTYGPIiANGATSVLFEGI PTYPDVNRLWS I VDKYK 
VTKFYTAPTAIR1^KFGDEPVTKHSRASIX2VI^TVGEPINPEA 
WIjWYHRVVGAQRCPIVDTFWQTETGGHMIiTPLPGATPMKPGSAT 
FPFFGVAPAI I^ESGEELEGEAEGYLVFKQPWPGIMRTVYGNHE 
RFETTYFKKPPGYYVTGDGCQRDQDGYYWITGRIDDMLNVSGHL
LSTAEVESALVEHEAVAEAAWGHPHPVKGECLYCFVTLCDGHT
FSPKLTEELKKQIREKIGPIATPDYIQNAPGLPKTRSGKIMRRV
lrkiaqndhdlgdmstvadpsvi shlfshrci/t iq 


6423 


624 


1237 


ANI/KE I PRDIiPPETVLI*YI#DSNQ I TS I PNE I FIODLHQLRVIjMLS 
R IANNPWHCDCTIjQQVIjRSMASNHETAHNVI ckts vldehagrp 

flnaandadlcnlpkkt^dyamlvtmfgwftmvisyvvyyto 
QEDARRHLEYLKSLPSRQKKADEPDDISTW


6424 


1 


1188 


KKVSWPVAAMVHCSC^FRKYGWFIDKIiRLFTRGGSGGMGYPRI> 
GGEGGKGGDVVTVVAHNRMTLKQLKDRYPRKRFVAGVGANSKISA 

GGKLL.TNFLPLKGQKRI IHLDLKLIADVGLVGFPNAGKSSLLSC 
VSHAKPAIADYAFTrLKPELGKIMYSDFKQISVADLPGLIEGAH 
MNKGMGH KFLKHI ERTRQLIiFVVDI SGFQLSSHTQ YRTAFE TI I 
LXjTKELELYKEEL.QTKPALI^ 

KDFLHIiFEKNM I PERTVEFOHI IPISAVTGEGIEELKNCI RKSI* 
DEQANQENDAIiHKKQLLNLW I SDTMSSTE PPS KHAVTTS KMDI I 


6425 


1850 


1144 


ZiAMEGGGGl PIjETLKEESQSRHVLPAS fevnsi»qksnwgfli*tg 
LVGGTLVAVYAVATPFVTPAI^KVCLPFVPATMKQIENWKMIoR 

crrgslvdigsgdgrjviaaakkgftavgyei^wlvwysryra 

WREGVHGSAKFY I SDLWKVTFSQYSNVVT FGVPQMMLQIjE kkle 
RELEDDAR VlACR FP FPHWTPDHVTG EG r DTVWA YDASTFRGRE 
KRPCTSMHFQIiPIQA 


6426 


30 


565 


SRGAAVGGMSVAGGEIRGDTGGEDXAAPGRFSFSPEPTLEDIRR 
LHAE FAAERDWEQ FHQPRNL LLAL VGE VGELAELFQ W KTDG E P G 
PQGWSPPJSRAAI^EEI^DVLIYIjVALAA^ 

tmRRYPAHLARSSSRKYTELPHGAISEDQAVGPADI PCDSTGQT 
ST 


6427 


145 


959 


AASWGPPHVPKAGKMVSWMI CR1.WLVFGMLCPAYAS YKAVKTK 
NIREYVRWMMY>?1^FALFWAAEIVTDIFISWFTFYYEIKMAFV1» 
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SEQ 
W LI»S P YTKGAS IxL YR KFVH PSLSRHE KE I DAY I VQAKERS YETV 

IiS FGKRGIjN iaasaavqaatksqgalagrlrs fsmqdlrs xsda 

PAPA YHDPL Y1»KDQVSHRRP PI GYRAGG LQDS DTED ECWSDTEA 
WRAPARPREKPLIRSQSLRVVKRKPPVREGTSRSLKVRTRKKT 
VPSDVDS 


6428 


1982 


444 


SGSGG KMEDHQHV P I DI QTS KLItDWUVDRRHCS LKWQS1»VL»TI R 
E K I NAAI QDMP ESEE IAQ LIiSGS Y IHYFHCLRI LDLLKGTEAST 
KNI FGRYSSORMXDWQE I IALYEKDNTYLVELSSLLVRNVNYEI 
PSLKKQIAKCQQLQQEYSRKEEECQAGAAEMREQFYHSCKQYGI 
TGENVRGELIJVL.VKDLPSQLAEIGAAAQQSLGEAIDVYQASVGF 
VCES PTEQVLPMLRFVQKRGNSTVYEWRTGTEPSVVERPHLEETL 
PEQVAEDAID WGDFGVEAVSEGTDSGI SAEAAG I DWG I F PES DS 
KDPGG DG IDWG DDAVALQ I TVLEAGTQ AP EGVARG PDAL TL»LEY 
TETRNQFI*DELMEIiE2 FI*AQRAVELS EEADVLS VSQFQLAPAI L» 
<^G^KEKJ^v™VS VLiEDL IGKLTSLQLQHLFMl LAS PR YVDRVT 
EFMQKLKQSQLLALKKEIjMVQKTOEALEEQAAIjEPKLDLLLEK 
TKEI»QKL>I EAD I S K R YSGR P VNLMGTS t> 


6429 


3413 


3442 


EPSSWTAAPRGPUAAHPJLEAAVQEDDRRALS FDSRI kvfangti* 

WKSVTDKDAGDYLCVARNKVGDDYVVLKVDVVMKPAKIEHKEE
NDHKVFYGGDLKVDCVATGLPNPEISWSLPDGSLVNSFMQSDDS
GGRTKRYWFNNGTLYFNEVGMREEGDYTCFAENQVGKDEMRVR
vkwtapat1 rnxtclavqvpygd wtvaceakgepmp kvtwlis 
ptnkv i pts sekyq i yqdgtll i qkaqrs dsgnytclvrnsage " 
drktvwihvnvqppkjngnpnpittvreiaaggsrklidckaeg 
iptprvlwafpegvvi,papyygnritvhgngsi*dirslrxsdsv 
QLVCMARNEGGEARLIVQLTVLEPMEKPIFHDPISEKITAMAGH

TISLNCSAAGTPTPSLVWVLPNGTDLQSGQQLQRFYHKADGMIjH 
I SGLS S VDAGAYRCVARNAAGHTERIiVS LKVGLKP EANKQYHNL 
VS 1 3tNGETIiKLPCTP PGAGQGRFSWTLPNGMHLEG PQTTjGRVSIj 
LDNGTIiTVREAS VFDRGTYV CRMETE YGPSVTS I PVTVIAYPPR 
ITSEPTPVTY1!RPGNTVKLNCMAMGI PKADITWELPDKSHLKAG 
VQARLYGNRFLHPQGSLTIQHATQRDAGFYKCMAKNIIjGSDS kt 
TYIHVF 


6430 


1946 


602 


rtrvstgi*rrtllwseavgasstrgdtg i pgsgeggagpgggeg 
AMLEAMAEPSPEDPPPTLKPETQPPEKRRRTIEDFNKFCSFVLA
yagyippskeesdwpasgssspiirgesaadsdgwdsapsdiirti 
QTFVKKAKSSKRRAAQAGPTQPGPPRSTFSRLQAPDSATLLEKM

KLKDS LFDLDG PKVAS PLS PTSLTHTS R PPAALTP VPLSQGDIjS 
HPPRKKDRXNRKLGPGAGAGKGVLRRPRPTPGDGEKRSRIKKSK 
KRKLKKAERGDRLPPPGPPQAPPSDTDSEEEEEEEEEEEBBEMA 
TWGGEAPVPVLPTP PEAPRPPATVHPEGVPPADSESKEVGSTE 
TSQDGDAS SSEGEMRVMDEDI MVESGDDSWDLITC YCRKPFAGR 
PMIECSLCGTWIHLSCAKIKKTNVPDFFYCQKCKELRPEARRLG 
GPPKSGEP 


6431 


3 


605 


WWNSSYNLPAYAPYLPCEACAMQDGRKGGAYAGKMEATTAGVGR 
LEEEAIiRRKERLKALR^KTGRKDKEDGEPKTKHLREEEEEGEKH 
RELRLRNYVPETDEDLKKRRVPQAKPVAVEEKVKEQLEAAKPEPV 
I EEVDLANIAPRKPDWDI^DVAXKLEKLKKRTQRAIAELIRER 
LKGQEDSLASAVDAATEQKTCDSD 


6432 


56 


1692 


GGLGTMGSRIKQNPETTFEVYVEVAYPRTGGTbSDPEVQRQFPE 
D YSDQEVLQTLTKFCFP FYVDSLTVSQ VGQNFTFVLTD I DS KQR 
FGFCRLSSGAKSCFCILSYLPWFEVFYKI^NIIJVDYTTKRQENQ 
WMELLETLHKLPI PDPGVSVHLS VHS YFTVPDTRELPS I PENRN 
LTE YFVAVDVNNMLHLYASMLYERRILI I CSKLSTLTACIHGSA 
AMLYPMYWQHVYIPVLPPHLIJ)YCCAPMPYLIGIHI>SI^EKVRN 
MALDDWI LNVDTNTIiETPFDDIjQS LPNDVI S SLKNRI*KKVSTT 
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SEQ 
TGDGVARAFIjKAQAAFFGS yrnalki epebp itfceeafvshyr 

SGAMRQFI^NATQIXJLc KQFIDGRLDIal^SGEGFSDVFEEEINM 
GBYAGSDKLYHQWLSTVRKGSGAIWIVCT 

AENGCAPTPEEQLPKTAPS PLVEAKDPKLREDRRPITVHFGQVR 
PPRPHWKRPKSNIAVEGRRTSVPSPEQNTIATPATLHILQKSI 
TKFAAKFPTRGWTSSSH 


6433 


1524 


484 


AP VTKRKEVFAKDS KGSAItDAGRD PKRPALPBTTiCESGWASNTA 
PTTPPQPGWCiCGKDFKSSCX3TPGREKERRIiATMHGSCS FLMLL 
LPLLliLriVATTGPVGALTDEEKRLMVEIiHNIj YRAQVSPTAS DML 
HI^WDEE1«AAFAKAYARQCVWGHNKE RGRRG ENI*FAITDEGMDV 
PLAMEEWHHEEEHYNIiSAATCSPGQMCGHYTQVVWAKTERIGCXS 
SHFCEKLQGVEETNIELIjVCNYE P PGNVKGKRP YQ EGTPCSQCP 
SGYHCKNSLCEPIGSPEDAQDI*PYIjVTEAPSFRATEASDSRKMG 
AEGPDKPSWSGLNSGPGHVWGPLLGliIjLLPPIiVLAGIF J 


6434 


40 


2002 


MPQLNFGMADPTQMGGLS^LIAGEHAJ^GTPEVFSGTCRPDVSE 
SPELRQKSPIjFQFAEISSSTSHSDASTKQCQTSAXiFQEAEISSN 
TSQI/3GAEPV7CRCGKSAIjKQLAEMCnaA£EGM 

DGGRIKELEKGKEEKEIKMEXTDETRMKEAEFEKSAKENLRDS 
KELRNFEALQIDDIMAIKMEDPKE IRXEELEEDHKCSHFPDFSY 
SASSKI I ISDVPSRKDHMCHPHGIMI IEDPAALNKPEKLKKKKK 
KSKMDRHGNDKSTPKKTCKKRQSSESD I ES VI YT I EAVAKGDWG 
XEKLGDTPRKKVRTSSSGKGS ILDAKPPKKKVKSREKKMSKEKS 
SDTTKESRPPDFISISASKNISGETPEGIKAEPLTPMEXJALPPS 
LSGQ AKPEDSDCHRKI ETCGSRKSERS CKGALY KTLVSEGMLTS 
LRftNVDRG KRS SGKGNSSDHEGCWNEES WT FSQSGTSGSKKFKK 
TKPKEDCXlLGSAKLDEEFEKKFNSLPQYS PVTFDRKCVPVPRKK 
KKTGNVSSEPTKTSKGSGDKWSNKQLFLDAIHPTEAIFSEDRNT 
MEPVHKVKNI PS I FNTPEPTTTARTFGGQPKEKSKENPDYSPCQ 
DTQRAGYimEEVLWMTNIJ^mCGGWLKQLRHTAMTNA 


6435


2227 


657 


AI>QRDAAAAYAHPEYEERFliQEETVSQQINSIEXI^TRPLAI*PE 
VVKSQRPLQRQVHIiRGRPASQPTVTRG ITYYKAKVSE E END I EE 
QQDEFFSGDNGVDLIjIEDQLLRHNGI^TSVTRRPAATRQGHSTA 
VTSDLNARTAPWSSALPQPSTSDPSIANHAS VGPTLQTTSVS PD 
PTRESVLQPS PQVPATTVAHTATQQPAAPAPPAVS PREALMEAM 
HTVPVPPTTVRTDSLGKDAPAGRGTTPASPTLSPEEEDDIRNVI
GRCKDTLSTI TCPTTQNTYGRNEGAWMKDPIAKDERI YVTNYYY 
GNTTj\^FRN3jENFKQGRWSNSYKLPYSW JGTGH\A/YNGAFYYNR 
AFTRNI I KYDriKQRYVAAWAMLHDVAYEEATPWRWQGHSDVDFA 
VDENGLWLIYPALDDEGFSQEVlVLSKIiNAADI»STQKETTWRTG 
LRRNFYGNCFVICGVLYAVDS YNQRNANI SYAFDTHTNTQIVPR 
LLFENE YFYTTQIDYNPKDRLLYAWDNGHQVTYHVI FAY 


6436 


1295 


341 


GACRPP VRQD PDSGPD YEAXjPAGATVTTHMVAG AVAG I LEHC VM 
YPIDCVKTRMOSLQPDPAARYRNVLEALWRI I RTEGIj WRPMRGI* 

VATLLHDAAMNPAE WKQRMQM YNS PYHRVTDCVRAVWQNEGAG 
AFYRS YTTQI»TMNVP FQAIHFMTYE FLQEHFNPQRR YNPS S HVL 
SGACAGAVAAAATTPLDVCKTLLNTQESI tALNSHITGH I TGMAS 
AFRTVYQVGGVTAY FRGVQARVI YQ I PSTAI AWS VYEFFKYLIT 
KRQEEWRAGK 


6437 


1828 


360 


P P AP AP PAS PAR HVTRTARGRLEGGS RAP PLLQAVFLQ I KNMVK 
LIHTIADHGDDWCCAFSFSLIATCSnDKTIRliYSLRDFTELPH 
SPLKFHTYAVHCCCFS PSGHIIASCSTDGTTVLWNTENGQMIAV 
MEQPSGS PVRVCQ FSPDSTCLASGAADGTVVLWNAQS YKLYRCG 
S VKDGS tAACAFS PNGSFFVTGSSCGDI»TVWDDKMRCLESEKAH 
DLG I TCCDFS SQP VSDGEQGLQF FRIAS CGQD CQVKI W I VS FTH 
ILGFELKYKSTLSGHCAPVIACAFSHDGQN5LVSGSVDKSVIVYD 
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SEQ 
TNTENILHTLTQHTRYVTTCAFAPNTf J .T.ATGSMDKTVN I WQFD 
LETLCQARSTE^QLKOFTEDWSEEDVSTWLCAQDLKDLVGI FKM 
NNI DGKKLLNLTKESIADDLKIESI^LJtSKVLRKIEEIiRTKVKS 
LS S G I PDE P I CP ITRELMXDPV IAS DGYSYEKEAMENWDPAKRN 
RTSPP 


6438 


109 


901 


EVQIIjRAJOiFQTGGIilVFYGIiliAQTMAQFGGIiPVPLI^bPI^ j 

NPALPLSPTGLAGSLTNAI^NGIiLSGGLIiGILENLPLLDILKPG 

GGTSGGLLGGltliGKVTSVT PGLNN I IDIKVTDPQLLELGLVQSP 

DGHRLYVTI PI^IKLQVim^VGASIJaRIAVKlJDI^ 

KQER I HLVLGDCTHS PGS LQI SI*LDGI*GPI*P I QGLLDSLTG ILN 

KVL PELVQGNVCPLVNE VIiRGLDI TLVHDIVNML I HGIiQFVI KV 


6439 


23 


412 


SIQTASAITTEMASQSQG I QQUiQAEKRAAEKVADARKRKARRL 
KOAKEEAQMEVEQ YRREREHE FQS KQQAAMGSQGNXiSAEVEQAT 
RRQVQGMQSSQQRNRERVliAQI^MVCDVRPQVHPNYRISA 


6440 


3 


517 


RARWNSDMGDLiPGLVRIjS I ALR IQ PNDGPVF Y KVDGORFGONRT 
I KLLTGSS YKVEVKI KPSTLQVENIS IGGVLVPLELKSKEPDGD 
RWYTGTYDTEGVTPTKSGERQP IQITMPFTD IGTFETVWQVKF 
YNYHKRDHCQWG S PFSVI EYECKPNETRSLMWVNKESFL 


6441 


234 


1373 


KSGGliRRRQRPGRSAAVGE EELP PGMEKFKAAMLIjGS VGDALGY 
RNVCKENSTVGMKIQEELQRSGGLDHIiVLSPGEWPVSDNT I MHI 
ATAEAIjTTDYWCLDDLYREMVRCYVEIVEKLPERRPDPATIEGC 
AQLKPNNYLtiAWHTP FNEKGSGFGAATKAMC3 GLRYWKPERLET 
L 1 EVSVE CGRMTHNHPTGFLGSIjCTAI*FVSFAAQGKPI>VQWGRD 
MLRAVPIJ^EYCRKTIRHTAEYQEHWFYFEAKWQFYLEERKISK 
DSENKAIFPDNYDAEEREKTYRKWSSEGRGGRRGHDAPMIAYDA 
LLAAGWSWTEIiCHRAMFHGGESAATGTIAGCLFGIiLYGLDLVPK 
GI>YQDLEDKEKLEDLGAAIiYRI.STEEK 


6442 


34 


796 


AEXlPAGGIAGQDTMFARGLKRXCVGHEEDVEGALAGbKTVSSYS 
LQRQSLIJJMSIiVKLQIXHMLVEP^CRSVLIANTVRQIQEEMTQ 
DGTWRTVAPC^ERAPLDRLVSTEILCRAAWGQEGAHPASGLGD 
GHTQG P VSDLCP VTS AQAPRHLQSSAWEMDGPRENRGS FHKSLD 
QIFETLETKNPSCMEELFSDVDSPYYDLDTVLTGMMGGARPGPC 
EGLEGLAPATPGPSSSCKSDIiGELDHWEILVET 


6443 


2 


555 


MASPAASSVRPPRPKKEPQTIiVIPKNAAEEQKLKLERLMKNPDK 
AVPIPEKMSEWAPRPPPEFVRDVMGSSAGAGSGEFHVYRHIiRRR 
EYQRQDYMDAMAEKQKLDAEFQKRIjEKNKIAAEEQTAKRRKKRQ 
KI.KEKKI1I1AKKMKLEQKKQEGPGQPKEQGSSSSAEASGTEEEEE 
VPSFTMGR 


6444 


390 


899 


GSTPRGKMRAPI PEPKPGDL I E I FRPFYRHWAI YVGDGYWHEA 
PPSET^GAGAASV^ALTDKAIVKKELI/YDVAGSDKYQVNNKHD 
DKYSPI*PCSKI IQRAEEL1VGQEVI1 YKLTS ENCEHFYNELR YGVA 
RSDQVRDVI IAASVAGMGLAAMSLIGVMFSRNKRQKQ 


6445 


2 


753 


AGAAGAAGAARS PRPQAHTKGVRGLPSRRRS PDCGRMEIAAGS F 
SEEQFWEACAELO^PAIAGAD WQLLVETSGI S I YRLLDKKTGLY 
E YKVFGVLEDCS P TLLADI Y>fDSD YRKQWDQ YVKELYEQE CNGE 
TVVYWEVKYPFPMSNRDYVYI^QRRCLDMEGRKIHVILARSTSM 
PQLGERSGVIRVKQYKQSIAIESDGKKGSKVFMYYFDNPGGQIP 
SWblNWAAKNGVPNFLKDMARACQNYT.KKT 


6446 


1 


1651 


RCPTRSPPPDTPGSRGTTAMCSLASGATGGRGAVENEEDLPELS 

DSGDEAAV7EDEDDADLPHGKQQTPCLFCNRLFTSAEETFSHCKS 

EHQFNI DSMVHKHGLEFYGYI KIiINF3CKLKNPTVEYMNS I YKP V 

PWEKEEYLKPVLEDDI*LIX2FDVEDIjYEPVSVPFSYPNGLSENTS 

WEKLKHMEARALSAEAAIJU^AREDLQKMKQFAQDFVMH^ 

CS S STSVXADLQEDEDGVYFS S YGHYGIHEEMLKDKI RTES YRD 

FI YQNPH I FKDKWUDVGCGTG I I^SMFAAKAGAKKVLGVDQSEI 
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ID 
LYQAMDI IRliNKIiEDTITLI KGKIEE VHLP VEKVDVI I SEWMG Y 
FIAjFESMIjDSVL YAKNK5fl«AKGGS VY PD I CT ISLVAVSDVNKHA 
DRIAFWDDVYG FKMS CMKKAVI PEAWEVLD PKTL I S EPCGIKH 
IDCHTTSISDI^FSSDFTLKITRTSMCTAIAGYFDIYFEKNCHN 
RVVFSTGPQSTKTHWKQTVFLLEKPFSVKAGEAI^KVT^/HKNK 
KDPRSLTVTLTLNNSTQTYGLQ 


6447 


1554 


1068 


RliG PAEWHI^GPCHATLGAANRGRALGVRAAWRGAPI,CQRVMMP 
SRTNIATGI PSS KVKYSRIrS S TDDG Y I DLQ PKKTPP K I PYXAIA 
LATVL.FLIGAFLI 1 1 G SLLLSGY I S KGGADRAVPVLI IG I LVFL 
PG FYHLR I A YYASKG YRGYS YDD I PD FDD 


6448 


74 


559 


GQVLSHCYHYRSSRWRRGGLSRGRGAGVMALVPYEBTTEFGLQK 
FHKPIiATFS PANHTTOTRODWRHIjGVAAVVWDAAIVI»STYLFMG 

AVEIiRGRS AVELGAGTGL VG I VAALIiACR I RYERDNN FLAMLER 
QFIVRKVHYDPEKDVHIYEAQKRNQKEDL 


6449 


597 


1876 


E YG VCENliRKLE ITGVS CRD VYAKLLHR YRH I LGLWQ PD IGPYG 
GLLNWVDGLFI IGWMYLPPHDPHVDDPMRFKPLFRIHIjMERKA 

FRTWLREEWGRTLEDI fhehmqeli lmkfiytsqydncltyrri 

YLPPSRPDDLIKPGLFKGTYGSHGIxEIVT1LSFHGRRARGTKI , 1X5 
DPN I PAGQQTVE I DLRJITII QLPDLENQRNFNELSRI VLEVRERV 
RQEQQEGGHEAGEGRGRQGPRESQPSPAQPRAEAPSKGPDGTPG 
EDGGE PGDAVAAAEQPAQCGQGC; P FVLPVGVS SRNEDYPRTCRM 
CFYGTGIiI AGHGFTS PERTPGVFIIjFDEDRFGFVWLELKS FSliY 
SRVQATFRNADAPSPQAFDEMLKNIQSLTS 


6450 


848 


269 


fvpaprtvsgkrslfgeweergegeqrtgrefsgnggraveaar 
mrll cg lwlw ls uikviiqaqtptpl plpppmqs fqgnqfqgewf 
vlgiiagnsfrpehrt^l^aftatfel^ddgrfevwnamtrgqhc 

DTWSYVLIPAAOPGOFTVDHRVWTHEOAGR PQDQPAGOELVAAS 
RDAGPVHLPGQSSGPLG 


6451


232 


939 


HSPTPPTSPRASTMEDVKLEFPSLPQCKEDAEEWTYPMRREMQE 
XLPGIiFIjGP YS S AMKS KLP VLQKHG I THI I CI RQNTEANF I KPN 
FQQLFRYLVLDIADNPVENI IRFFPMTKEFIDGSliQMGGKVLVH 
GNAG I SRSAAFVIAYIMETFGMKYRDAFAYVQERRFCINPNAGF 
VHQLQE YEAI YliAKLTI QMMS PIjQ I ERSIjS VHSGTTGSLKRTH E 
EEDD FGTMQ VATAQNG 


6452 



1 


652 


RTRGESSNME PI*AA YPDKCS G PRAKVFAVLLS I VLCTVTIj FLLQ
LKFLKPKrNSFYAFSVKDAKGRTVSLEKYKGKVSLWNVASDCQ 
LTDRNYLGLKEl^KEFGPSHFSVLAFPCNQFGESEPRPSKEV^ 
FARKNYGVTFPIFHKIKII^SEGEPAFRPXVDSSKKEPRWNFWK 
YLVNP EGQWKFWRPEEP I EVIRPD IAAItVRQVI I KKKEDIi 


6453 


827 


223 


HRRWLPGbSMSPRRTLPRFLSLCljSLCLCl^LAAALGSAQSGSC 
RDKKNCKVVFSQQELRKRLTPLQYHVTQEKGTESAFEGEYTHHK 
DPGIYKCVVCGTPliFKSETKFDSGSGWPS FHDVINSEAITFTDD 
FS YGMHRVETSCSQCGAHLGHI FDDGPRPTGKRYCINSAALSFT 
PADSSGTAEGGSGVAS PAQADKAEL 


6454 


827 


223 


HRRWLPGLSMSPRRTLPRPLSLCLSLCLCTiCLAA 
RDKKNCKWFSQQELRi<KLTPIX5YHVTQEKGTESAFEGEYTHHK 
DPGIYKCVVCGTPDFKSETKFDSGSGWPS FHDVINSEAITFTDD 
FS YGMHRVETS CSQCGAHLGH I FDDGPR PTG KRYCINS AALS FT 
PADSSGTAEGGSGVAS PAQADKAEL 


6455 


1042 


173 


RVHLATVSASAAWDALGLPVRSHMQGS TRRMG VMTDVHRRFLQL 
I^THGVLEEWDVKRLOTHCYKVKDRNATVDKLEDFINNINSVLE 
SLYIEI KRGVTEDDGRPI YALVN1ATTS IS KMATDFAENELDLF 
RKALELIIDSETGFASSTNILNLVIX^KGKJCMRKKE^ 
VQmCWLIEKEGEFTLHGRAILEMEQYIRETYPDAVKICN I CHS L



LIQGQSCBTCGIRMKLPCVAKYFQSNAEPRCPHCNDyV7PHEIPK 
VFD PEKERES GVLKSNKKS LRSRQH 



6456 



555 



RPQSRS I SMWRNSLLQVS SGLRWLRVCAMVD I LGERHLVTCKGA 
TVEAEAALQNKVVALYFAAARCAPSRDFTPKLCDPYTAIjVT^EAR 
RPAPFEWFVSADGSSQFJ4I^FMRELHGAWLALPFHD PYRHELR 
KR YNVTAI P KLVIVKQNGEVT TN KG RKQ I RERGLAC FQ D WVEAA 
DIFQNFSV 



6457 



23 



892 



PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKI IHFPDFDKKIPV 
KLFPLPLLYVGNHISGLS STSKLS LPMFTVLRKFT I PLTLLLET 
1 1 LGKQYSLNI I LSVFAI ILGAFIAAGSDLAFNLEGY I FVFLND 
I FTAANGVYTKQKMD P KELGKYGVLFYNACFM I IPTLI I SVSTG 
DLQQATEFNQWKNWFIIjQFXiIjSCFI/SFIJJMYSTVIjCSYTOSAIi 
TTAWGAI KNVSVAY I G I IiIGGDY I FS LLNFVG LN I CMAGGLR Y 
SFLTLSSQLKPKPVGEENICLDLKS 



6458 



23 



892 



PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKI IHFPDFDKKIPV 
KLFPLPLLYVGNH ISGLS STS KLS LPMFTVLRKFTI PLT LTJ . K T 
IILGKQYSLNIILSVFAI I LGAF I AAGSDLAFNLEG Y I FVFIjND 
IFTAANGVYTKQKMDPKEL^KYGVLFYNACFMI IPTLI I SVSTG 
DLQQATEFNQ WXNVVF I LQ FLLS C FLG FLLMYS TVIjCS YYNSAL 
TTAWGAI KNVSVAY IG I LIGGDY I FS LLNFVGLN I CMAGGLR Y 

SFLTLSSQLKPKPVGEENICLDLKS 

PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKI IHFPDFDKKIPV 



6459 



23 



892 



6460 



23 



892 



6461 



1653 



360 



6462 



773 



6463 



350 



6464 



12 



1154 



KLFPLPLLYVGNHISGLS STSKLSLPMFTVLRKFTIPLTLLLET 
1 1 LGKQYSLNI I LSVFAI I LGAFIAAGSDLAFNLEGY I FVFLND 
I FTAANG VYTKQKMDPKE LGKYGVLFYNACFM 1 1 PTL 1 1 SVSTG 
DLQQATEFNQWKNWFI LQFLLSCFLiGFLLMYS TVLCS YYNS AL 
TTAWGAI KNVSVAY I GI L IGGDY I FSLLNFVGLNI CMAGGLRY 
SFLTLSSOLKPKPVGEENI CLDLKS 

PTTGFPVTNFPWNWPDGKPPIMILYVSKLNK1 IHFPDFDKKIPV 
KXFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
1 1 I^KQYSLN 1 1 I^VFAI I LGAFIAAGSDLAFNLEGY I FVFLND 
I FTAANGVYTKQKMDPKELGKYGVLFYNACFMI I PTL 1 1 SVSTG 
DLO^ATEFNQWKNVVFILQFLLSCFLGFLI^STVLCSYYNSAL 
TTAVVGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY 

SFLTLS SQLKPKPVGEENICLDLKS 

LO^RTLRITAVGQraPIAWMAWEPSLGAFYGPASFITFWCMYF 
I^IFIQLKRHPERKYF^KEPTEEQQRLAANENGEINHQDSMSLS 
LISTSALENEHTFHSQLLGASLTLLLYVALWMFGALAVSLYYPL 
DLVFS FVFGATSLSFSAFFWHHCVNREDVRIAWI MTCCPGRSS 
YSVQVNVQPPNSNGTNGEAPKCPNSSAESSCTNKSASSFKNSSQ 
GCKLTNLQAAAAQ CHANS LPLNSTPQLDNSLTEHSMDND I KMHV 
APLEVQFRTNVHSSRHHKNRSKGHRASRLTVLREYAYDVPTSVE 
GS VQNGLPKSRLGNNEGHSRSRRAYLAYRERQ YN P PQQDS SDAC 
STLPKS S RNFEKPVSTTS KKDALRKPAWELENQOKS YGLNLAI 
QNG P I KSNGQEG PLLGTDSTGNVRTGLWKHETT V 



S EELDRE KKLKEDS PR KTPN KE S G VPS L P VS LTS I KESPKEAKH 
PDSQSMEBS KLKNDDRKTPVNW KDSRGTRVAVS S PMSQHQSY I Q 
YLHAYP YPQMYDP SHPAYRAVS P VLMHS YPGAYLS PGFHY P VYG 
KMSGREETEKVNTS PS VNTKTTTES KALDLLQQHANQYRS KS PA 
PVEKATAERFJ2EAERERDRHSPFGQRHLHTHHHTHVGMGYPLI P 

GQ YD P FQGLTSAALVASC/QVAAQASA SGMFPGQRRE 



VILCILGGWIFKNADRSMEKKKGEPRTRAEARPWVDEDLKDSSD 
LHQAEBDADEWQES E ENVEH I P FSHNHY P E KEMVKRS QEFYELL 

NKRRSVRFISNEQVPMEVIDNVIRTAGL 

GI LRQKEREERNR I H KKE I LFLEHLLW PSEMSS LSGKVQTVLG" 








IiVEPSKLGRTLTHEJtLAMTFDCCYCPPPPCQEAISKEPIVMKNIj 
YWIQKNAYSHKENIiQIjNQETEAI KEELLY FKANGGGALVENTTT 
Gl SRDTQTLKRLAEETGVHI I SGAGFYVBATHSSETRAMSVEQL 
TDVLMNEILHGADGTSIKOGI IGEIGCSWPLTESERKVLQATAH 
AQAQLGCPVI IHPGRSSRAPFQI IRI WEAGADISKTVMSHLDR 
TI LDKKELLEFAQI^CYl*EYDLyGTELlfiYQLGPDIDMPDDNKR 
I RRVRLLVEEG CEDR I LVAHD IHTKTRLMKYGGHGY SH I LTNW 
PKMLLRG 1 TENVLDKILI ENPKQWLTFK 


6465 


126 


1396 



KMTVFFKTLRNHWKKTTAGLCLJ^llVGGHWLYGiCHCD^I.LRRAAC 
QEAQVFGNQL I PPNAQVKKATVFLNPAACXGKARTLFE KNAAP I 
LHLSGMDVTIVKTDYEGOJ^KLLELMENTDVI IVAGGDGTLQEV 
VTGVLRRTDEATFS KI P IGFIPIiGETSSLSHTLFAESGNKVQHI 
T]^TI^VKGETVPLDVIX3IKGEKEQPVFAOTGLRWGSFRDAGV 
KVSKYWYIiEPIjKI KAAHFFSTLKEWPQTHQASISYTGPTERPPN 
E P EET PVQRP S L YRR I LRRIiAS Y W AQ PQDALS OEVS PE VW KDVQ 
LSTI EI*S I TTRNNQLDPTSKEDFLNI CIEPDT ISKGDFI TIGSR 
KVRNPKXHVEGTECLQASQCTrLL.1 PEGAGGS FS I DSEE YEAMP V 
EVICLLPRKLQFFCDPRKREQMLTSPTQ 


6466 


1134 


828 


VARGTELSQLEKAHPPADMGRRKS KRKPPPKKKMTGTfcETQFTC 
PFCNHEKSCDVKMDRARNTGVISCTVCLEEFQTPITYLSEPVDV 
YSDWIDACEAANQ 


6467 



301 


2571 



GEIiRVLALAHGELACHAVL,TASLLSLRSRLMDSDMDYER PNVET 
I KCWVGDNAVGKTRLI CARACNATLTQYQLLATHVPTVWAIDQ 
YRVCQEVIjERSRDWDDVSVSIjR1>WDTFGDHHKDRRFAYGRSDV 
VVXCFSIANPNSIiHHVKTMVOTEIIOIFCPRAPVILVGCQLDLRY 
ADLEAVNRARRPIARPIKPNEILPPEKGREVAKELGIPYYETSV 
VAQFG I KDVFDNAI RAAIiI SRRHIjQ FWKSHLRNVQRPIxLQAP FL 
PPKPPPPIIVVTDPPSSSEEC^AHlxLEDPLCADVILVLQERVRI 
FAHKIY1*STSSSKFY15LF1jMDLSEGEIjGCPSBPGGTHPEDHOG1I 
SDQHHHHHHHHHGRDFIJ^RAAS FD VCES VDEAGGSGPAGLRAST 
SDGILRGNGTGYLPGRGRVLSSWSRAFVSIQEEMAEDPI»TYKSR 
LMVWKMDSSI QPGPFRAVLKYLYTGELDENERDLMHIAH IAEL. 
LE VFDLRMMVANI IiNNEAFMNQE I TKAFHVRRTNRVKEdiAKGT 
FSD VTF I LDDGTI S AHKPLLI S SCDWMAAMFGG PFVESS TPJB W 
FPYTSKSCMRAVLEYLYTGMFTSSPDLDDMKLI ILANRIiCLPHI* 
VALTEQ YTVTG LMEATQMMVD I DGDVX.VFLELAQ FHCA YQLADW 
CXHHICTNYNNVCRKFPRDMKAMSPENQEYFEKHRWPPVWYL.KE 
EDHYQRARKER^KEDYIiHliKRQPKRRWLFWNSPSSPSSSAASSS 
SP5SSSAW 


6468 


3 



1374 


DAWAGTKMAAIAPVGSPASRGPRIiAAGLRXJ^MIX5I>U3LLAEPG 
I^VHHIiALKDDVRHKVHLNTTO 

KDVTIGFSLDRTKNDGFSSYLDEDVNYCIUCKQSVSVTLLI 1»DI 
SRSEVRVKSPPEAGTQlrPKI I FSRDEKVIiGQSQBPNVNPASAGN 
O/TQKTQDGGKSKRSTVDSKAMGEKSFSVHNNGGAVSFQFFFNIS 
TDDQEGLYSLYFHKCIjGKELPSDKFTFSI»DIEITEKNPDSYIjSA 
GE I PI»P KI*YISMAF FFFIiSGT IWIHI LR KRRNDVFKI HWLMAAIj 
PFTKSUSLVFHAIDYHYISSG^FPISGWAVVYYITHIJL»KGALLF 
ITIALIGTGWAFI KHILSDKDKKIFMIVIPRRVIiANVAYI I IES 
TEEGTTE YGLWKDS I*FIjVDLXiCCGAII*FPVVWS I RHLQEAS ATD 
GKGKFS RAHFVLLS LL 


6469 


3 


1374 


DAWAGTNMAALAPVGS PASRGPRIxAAGLRLLPMLGLLQLLAEPG 
LGRVHHIJ^KDDVRHKVHIOTFGFFKrX3YMVVNVSSI>^ 
KDVTIGFSIjDRTKNDGFSS YLDEDVNYCI LKKQSVSVTLLILDI 
SRSEVRVKS PPEAGTQLPKI I FSRDEKVLGQSQEPNVNPAS AGN 
QTQKTQDGGKSKRSTVDSKAMGEKSFSVHNNGGAVSFQFFFNIS 
TDDQEGL YSLYFHKCIiGKEIiP SDKFT FSLDIE ITEKNPDS YItS A 








GEI PLPKLY I SMAFFFFLSGTIWIHIIJXKRRNDVFKIHWI^AAL 
PFTKSLSLVFHA1DYHYISSQGFPIEGWAVVYYITHLLKGALLF 
ITIALlGTGWAFIKHILSDKDKKIFMrVlPRRVLANVAYIIIES 
TEEGTTEYGLWKDSLFLVDLLCCGAILFPVWSIRHLQEASATD 
GKGKPSRAHFVLLSLL 


6470


2726 


1437 


AAASGVSSRADAPVLAQSPASAGNGRPSTPRVPGSRRHPSAPRS 
GPLPREIXSCRTPGPQLLPLPGALI^PRTLLSSAAETGRSRHPDT 
QHPSSGGRCRGGTES PSSAAGRPASMAEAEEDCHSDTVRADDDE 
ENES PAETDLQAQLOMFRAQWMFELAPGVS S SNLENR PCRAARG 
SLQKTSADTKGKQEQAKEEKARELFLKAVEEEQNGALYEAIKFY 
RRAMQLVPDI EPKI TYTRSPDGDGVGNSYI EDNDDDSKMADLLS 
Y FQQQLTPQES^KLCQPEIXSSQIHISVLPMEVLMYI FRWWS 
SDIJDLRSLEQI^LVCRGFYICARDPEIWRLACX.KVWGRSCIKLV 
PYTSWREMFLERPRVRFTX5VYISKTTYIRQGEQSLDGFYRAWHQ 
VEYYRYIRFFPDGHVMMLTTPEEPQSIVPRLRTR 


6471 


1750 


299 


FFFDKMAAGGSGVGGKRSSKSDADSGFLGLRPTSVDPALRRRRR 
GPRNKKRGWRRLAQEPLGLEVDQFLEDVRLQERTSGGLLSEAPN 
EKLFFVDTGS KEKGLTKKRTKVQKKSLLLKKPLRVDLI LENTS K 
V PAP KDVLAHQ V PNAKXLRRKEQLWE KLAKQGE LPREVRRAQ AR 
LLNPSATRAKPGPQDTVERPFYDLWASDNPIJDRPIiVGQDEFFLE 
QT KKKGVKR PARLHTK P S QAP AVEVAP AGAS YNP S FEDHQTL LS 
AAHEVEI/QRQKEAEKI^RQLALPATEQAATQESTFQELCEGIJUB 
ESDGEGEPGCK3EGPEAGDAEVCPTPARLATTEKKTEQQRRREKA 
VHRLRVQQAALF^ARLRHQELFRLRGIKAQVALRLAELARRQRR 
RQARREAEADKPRRLGRLKYQAPDIDVQLSS ELTDS LRTL KP EG 
NILRDRFKSFQRRNMIEPRERAKFKRKYTCVKLVEKRAFREIQL 


6472 


3 


897 


S CGSDRAQWAMEFPFDVDALFPERITVLDQHLRP PARR PGTTTP 
ARVDLQQQIMTI IDELGKASAKAQNLSAPITSASRMQSNRHWY 
ILKDS5ARPAGKGAIIGFIKVGYKKLFVLDDREAHNEVEPLCIL 
DFYIHESVQRHGHGRELFQYMLQKERVEPHQLAIDRPSQKLLKF 
LNKHYNLETTVPQVNNFVI FEGFFAHQHRPPAPSLRATRHSRAA 
AVDPTPAAPARKLPPKRAEGD I KPYSSSDRE FLKVAVEPPWPLN 
RAPRRATPPAHPPPRSSSLGNSPERGPLRPFVP 


6473 


22 


912 


SSAVEFVWEGEKMAAEPNKTE IQTLFKRLRAVPTNKACFDCGAK 
NPS WAS I TYGVFLCI DCSGVHRSLG VHLS FI RSTELDSNWNW FQ 
LR CMQ VGGNANATAFFH QHG CTANDANTKYNS RAAQMYRE KI RQ 
LGSAA1ARHGTDLWIDNMSSAVPNHSPEKKDSDFFTEHTQP PAW 
DAPATEPSGTQQPAPSTESSGLAQPBHGPNTDLIiGTSPKASLEL 
KSSI I G KKKP AAAKKGLGAKKGLGAQ KVS S QS FSEI ERQAQVAE 
KLREQ QAADAKKQAE ES MVAS MRLA YQELQ I DR 


6474 


3 


462 


LQRQRQHPAAAPAVPVRCFTFCFTDIVIMPKRKSPENTEGKDGS 
KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KGKKEEKQEAGKEGTAPSENGETKAEEIHISRSTVNVSTSRGTP 
PSTLSVKGOIETVRVKGTEN 


6475 


3 


462 


I^RQRQHPAAAPAVPVRCFTFCFTDIVIMPKRKSPENTEGKDGS 
KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KGKKEE KQEAGKEGTAPS ENGETKAEE IH I SRSTVNVSTSRGTP 
PSTLSVKGQIETVRVKGTEN 


6476 


106 


1090 


ARAMAQYKGTMREAGRAMHLLKKRERQREQMEVLKQRI AEETI L 
KSQVDKR FSAH YDAVEAELKS S TVGLVTLNDM KARQEALVRERE 
RQLAKRQHLEEQRLQQERQREQEQRRERKRKI SCLS FALDDLDD 
QADAAEARRAGNLGKNPDVDTSFLPDRDREEEENRLREELRQEW 
EAQREKVKDEEMEVTFSYWDGSGBRRTVRVRKGNTVQQFLKKAL 
QGLRKDFLELRSAGVEQLMFI KEDLILPHYHTFYDF I IARARGK 
SGPI^SFIJVHDDVRLLSDATMEKDESHAGKVVLRSWYEKNKHI F 

PASRWEAYDPEKKWDKYTIR 


6477 


227 


915 


LQGHLMG I MAAS RP LS RFWEWG KN I VCVGRNY ADH VREMRS AVL 
SEPVIjFliKPSTAYAPEGSPIIJrtPAyTRNljHHELEI/SVVMGKRCR 
AVTEAAA[^YVGGYALCIiD^ARDVQDECKKKGLPWLLAKS fta 
SCPVSAFVPKEKI PDPHKLKLWLKVNGELRQEGETSSMI FS I P Y 
1 1 S YVSK1 1 TLEEGD 1 1 LTGTP KGVGPVKENDE I BAG I HGLVSM 
TFKVEKPBY 


6478 


2 


149S 


FVSSRILPESLASSEASTLEAMGRKEEDDCSSWKKQTTNIRKTF 
I FMEVLGSGAFSEVFLVKQRLTGKLFALKCI KKSPAFRDSSLEN 
EIAVLKKIKHENIVTLEDIYESTTHYYLVMQLVSGGELFDRILE 
RGVYTEKDASLVIQQVLSAVKYLHETOGIVHRD^ 
ENSKIMITDFGLS KMEQNG IMS TACGTPG YVAPEVLAQKP YS KA 
VDCWS IGVITYI LLCGYPPFYEETES KLFEKI KEQYYEFES PFW 
DDISESAKDFICHLLEKDPNERYTCEKALSHPW I DGNTALHRDI 
YPSVSLQIQKNFAKSKWRQAFNAAAVVHHMRKLHMNL^ PGVRP 
EVENRPPBTQASBTSRPSSPEITITEAPVLDHSVAIjPALTQLPC 
QHGRRPTAPGGRS lnclvngs LH I S SSLVPMHQGSLAAGPCGCC 
SSCLNIGSKGKSSYCSEPTI*LKKANKKQNFKSEVMVPVKASGSS 
HCRAGQTGVCLIM 


6479 


3 


949 


SCRGPGWHPAGGQAGAMELLSALSLGELAIiSFSRVPLPPVFDLS 
YFIVSILYLKYEPGAVEI^RRIIPIASV^CAMLHCFGSYIIJ^LL 
LGEPLIDYFSNNSS ILLASAVWYLI FFCPLDLFYKCVCFLPVKL 
IFVAMKEVVRVRKIAVGIHHAHHHYHHGWFVMIATGWVKGSGVA 
LMSNFEQIiljRGVWKPETNEII^HMSFPTKASLYGAILFTI^ 
LPVS KAS L I F I FTLFMVS CKVFLTATHSHS S PFDALEGYI CPVL 
FGSACGGDHHHDNHGGSHSGGGPGAQHS AMPAKS KEELS EGSRK 
KKAKKAO 


6480 


192 


514 


DFMS I YFP I HCPDYLRS AKMTE VMMNTQ PMEE IGLS PRKDGLS Y 
QIFPDPSDFDRCCKLKDRLPSIWEPTEGEVESGELRWPPEEFL 
VQEDEQDNCEETAKENKEQ 


6481 


110 


1131 


KSRMDLDWNMFVIAGGTLAI P ILAFVAS FLLWPSALI RI YYWY 
WRRTLGMQVRYVHHEDYQFCYS FRGRPGHKPS I LMLHGFSAHKD 
MWLSVVKFLPKNLHLVCVDMPGHEGTTRS SLDDLS IDGQVKRIH 
QFVECLKLNKKPFHLVGTSMGGQVAGVYAAYYPSDVSSLWLVCP 
AGLQYSTONQFVQRLKBLQGSAAVEKIPLIPSTPEEMSEMLQLC 
S YVRFKVPQQILQGLVDVRI PHNNFYRKLFLEI VSBKSRYSLHQ 
HMDKIKVPTQI I WGKQDQVLDVS GADMLAKS IANCQVELLENCG 
HSWMERPRKTAKLI IDFLASVHNTDNNKKLD 


6482 


2517 


568 


BP VS KVSQSRRKAGVPTAN I EE S QAVEAAMANVP WAEVCEKFQA 

ALALSRVEljHKNPEKEPYTCSKYSARAIiEEVX^ 

RPFAEDGPGAGDHALGLPAEVVEPEGPVAQRAVRLAVIEFHLGV 

NHIDTEELSAGEEHLVKCLRLLRRYRLSHDCISLCIQAQNNLGI 

LWSEREEIETAQAYLESSEALYNQYMKEVGSPPLDPTERFIjPEE 

E KLTEQERS KRFE KVYTHNL YYIaAQVYQHLEMFEKAAH YCHSTL 

KRQLEHNAYHPIEWAINAATLSQFYINKLCFMEARHCLSAANVI 

FGQTGKISATEDTPEAEGEVPELYHQRKGE I arcwikycltlmq 

NAQLSMQDNIGELDLDKQSEIJIALRKKELDEEES IRKKAVQFGT 

GE LCDAI S AVEE KVS YLRPLDFEEAREXFLLGQH YVFEAKEFFQ 

IDGYVTDHI EWQDHS ALFKGLAF FETDMERRCKMHKRR IAMLE 

PLTVDLNPQYYLLVNRQIQFEIAHAYYDMMDLKVAIADRLRDPD 

SHIVKKINNLNKSALKYYQLFLDSLRDPNKVFPEHIGEDVLRPA 

MLAKFRVARLYGKI ITADPKKELENLATSLEHYKFIVDYCEKHP 

EAAQEIEVELELSKEMVSLLPTKMERFRTKMALT 


6483 


3 


623 


NSHIJ^CCIAARAPLSANGREARAMEQRIJ^FRAARKRAGLAAQP 
PAASQGAG/rPGEKAEAAATLKAAPGWLKRFLVWKPRPASARAQP 
GLVQEAAQPQGSTSETPWNTAI PLPSCWDQS FLTNI TFLXVLLW 
LVLLGIJVELEFGLAYFVI^LFYWMYVGTRGPEEKKEGEKSAYS 








VFNPGCBAIQGTLTAEQLERELQLRPLAGR 


6484 


201 


965 


QLAVKTKMSGLRPGTQVDPEI ELFVXAGSDGSSIGNCPFCQRLF 
MILWLKGVKFNVTTVTOITRKPEELKDLA^^ 

, DFIKXEE FLEQTLAPPR YPHLSPKYKES FDVGCNLFAKFSAYI K 
iTTQKFJ^KNFEKSLLKEPKRLDDYljNTPLLDEIDPDSAEEPPVS 
RRLFLDGDQLTLADCSLLPKIiNI I KVAAKKYRDFD I PAE FSGVW 
R YLHNAYAREE FTHTCP EDKE I ENTYANVAKQ KS 


6485 


6 


1091 


FVDLVRAVEFLPCPDSQKLEKECQSSEESMGSNSMRS I LEEDEE 
DEEP PRVLL YHEPRS FEVG^VWHKHKKYPFWPAVVKS VRQRD K 
KASVLYI EGHMNPKMKG FTVSLKS L KHFDCKE KQTLLNQARE D F 
NQDIGWCVSLITDYRVRLGCGSFAGSFLBYYAADISYPVRKSIQ 
QDVLGTKI^QLSKGSPEEPVVGCPLGQRQPCRKMLPDRSRAARD 
RANQKLVEYIGXAKGAE SHLRAI LKS RKPS RWLQTFLS S SQYVT 
CVET YliEDEGQLDLVVKYIjC^VYQE VGAKVMRTNGDR IRFILD 
VIJbPEAIICAISAGDEVDYKTAEEKYIKGPSLSYREKErFDNQL 
LEERNRRRR 


6486 


10 


581 


LVLQAGGAHLS PSRVTQG I Y YMLAFS EMPKPPDYSELSDSLTLA 

GGTGRFSGPLHRAWRMMNFRQRMGW I G VGL YLLASAAAF Y YVFE 

ISETYNRLAIJilHIQQHPEEPLEGTTWTHSLKAQ 

1 FLVP YLQMFLFLYSCTRADPKTVG YCI IPX CLAVI CNRHQAFV 

KASNQISRIiQLIDT 


6487 


352 


863 


SFLKPIiRGKMSVTLHTDVGDIKIEVFCERTPKTCENFLALCASN 
YYNGC I FHRNI KGFMVQTGD PTGTGRGGNS IWGKKFEDE YSEYL 
KHNVRGWSN1ANNGPNTNGSQFFITYGKQPHLDMKYTVFGKVID 
GLETLDETjEKL PVNEKTYRPLNDVHI KDITIHANPFAQ 


6488 


878 


241 


TAI^QEFGTSGPPLSLRFALPSGTGRFKPLFGARGPSWPPSPRVP 
ME P PNTi YPVKL YVYDLS KG LARRL S P I MLG KQLEG I WHT S IWH 
KDEFFFGSGGISSO?PGGTLLGPPDSVVDVGSTEVTEEIFLEYL 
SSLGESLFRGEAYNIiFEHNCNTFSNEVAQFLTGRKIPSYI TDLP 
SEVI^TPFGCALRPLLDS iqiqp PGGSSVGRPNGQS 


6489


1457 


375 


KVAKMATALS E E E LDN EDY Y S LX.NVRREASS EEL KAA YR R LCMLr 
YHPDKHRDPELKSQAERLFNIiVHQAYEVLSDPQTRAI ydi ygkr 

glemegwewerrrtpaei reeferlqrereerrlqqrtnpkgt 
isvgvdatdlfdrydebyedvsgssfpqieinkmhisqsieapl 
tatdtai lsgsls tqngngggs infaiirrvts akg wgei*e fgag 
dlc^plfgdklfrniitprcfvttncalqfssrgi rpglttvlar 

NIjDKNTVGYIjQWHCSSPIiLQVQRPHRNTRACAPE PS FRP FLHVP 
TWUAECSGARTPSTAOTSAAVKLREACLSGPGSGSHQLLLLTPR 
SKRRTGGG 


6490 


3 


1183 


HEAGCEVWIiGYGPRAAAAAAATVIjFGGAGPTETMFVARS IAADH 
KDLIHDVS FDFHGRRMATCSSDQSVKVWDKSES GDWHCTASWKT 
HSGSVWRVTWAHPE FGQVLAS CSFDRTAAVWEB I VGESNDKLRG 
QSHWVKRTTLVDSRTSVTDVKFAPKHMGLMliATCSADGIVRIYE 

>. »~«i->t rjurvn er»MCT rnn?Te/ , lfT OPCf**TOTitMT)OC!CDIllICDMT7iTr(?Cn 

APD VMNJjoQ W5 1/utih IbL KJ-io Uo L» ± o WW t*i KArtor't*lJ-/tv*j&jj 
DSS PNAMAKVQI FEYNENTRKYAKAETLHTVTDPVHDXAFAPNL 
GRSFHILAI ATKDVRIFTLKPVRKELTSSGGPTKFE IHIVAQFD 
NHNSQVWRVSWNI TGTVLAS SGDDGCTVRIjWTCANYMDNWKCTG I L» 
KGNGS PVNGSSQQGTSNPSLG SNIPS LQNS LNGSSAGRKHS 


6491 


3 


1183 


HEAGCE VWLGYG PRAAAAAAATVLFGGAGPTETMFVARS IAADH 
KDLIHDVSFDFHGRRMATCSSDQSVKVWDKSESGDWHCrASV7KT 
HSGSVWRVTWAHPE FGQVLAS CS FDRTAAVWEE I VGESNDKLRG 
QSHWVKRTTLVDSRTSVTDVKFAPKHMGIoMLATCSADGIVRIYE 
APDVMNLSQWSLQHErSCKLSCSCISWNPSSSRAHSPMIAVGSD 
DSS PNAMAKVQ I FE YNEOTRKYAKAETLMTVTDP VHD I AFAPNL 
GRSFH ILAIATKDVRI FTLKPVRKELTSSGGPTKFE IHIVAQFD 
NHNSQVWRVS WNI TGTVLAS SGDDGCVRLWKANYMDNWKCTGI L 



KGNGSPVNGSSQQGTSNPSLGSNIPSLQNSLNGSSAGJRKHS 



6492 



34 



2573 



IPFLKSCCCCCLFDFPPPPLDQVQEEECEVERVTSHGTPKPFRK 
FDSVAFGESQSEDEQFENDLETDPPNWQQLVSREVLLGLKPCEI 
KRQEVINELFYTERAHVRTIJCVLDQVFYQRVS REG I LS P S ELRK 
IFSm^DIU?I^IIGLNEOMKAVRXPJJETSVIDQIGEDLLTWFSG 
PGEEKLKHAAATFCSNQPFALEMIKSRQKKDSRFQTFVQDAESN 
PLCRRLQLKDI I PTQMQRLTKYPLLLPNIATYTBWPTEREKVKK 
AADHCRQILNYVNQAVXEAENKQRLED YQRRLDTSS LKLS EYPN 
VEELRNLDLTiGiKMIHEGPLVWKVNRDKTIDLYTLLLEDILVLL 
QKQDDRLVLRCHSKILASTADSKHTFS P VI KLS TVLVRQVATDN 
KALFVISMSDNGAQI YELVAQTVSEKTVWQDLI CRMAASVKEQS 
TKP I PL PQSTPG EGDNDEED PSKLKEEQHGI S VTGLQS PDRDLG 
LESTLISSKPQSHSLSTSGKSEVRDLFVAERQFAKEQHTDGTLK 
EVGEDYQIAIPDSHLPVSEERWALDALRNLGLIiKQLLVQQLGLT 
EKS VQEDWQHFPRYRTASQGPQTDS VIQNSENI KAYHSGEGHMP 
FRTGTGDIATCY S PRTSTES FAPRDS VGLAPQDSQASNILVMDH 
MIMT P EM PTME PEGGLDDS GEHFFDARE AHSDENPSEGDGAVNK 
EEKDVNLRISGNYLILDGYDPVQESSTDEEVASSLTLQPMTGIP 
AVBSTHQQQHS PQNTHSDGAI S PFTPEFX»VQQRWGAMEYS CFE I 
QSPSSCADSQSQ IMEYXHKIEADLEHLKKVEESYTILCQRLAGS 
ALTDKHSDKS 



6493 



557 



1147 



TPARMAYQGSSTS DCMSKTLDS ASAHFAAS AVVSAP VPSRS EVA 
KEQNTGHNNINGWQPSGTSKTLYSTNMALSSS PGISAVQLVRT 
VGHTTTNHLI PALCTSSPQTLPMNNSCLTNAVHLNNVSVVSPVN 
VH IN TRTS APS PTALKIiATVAASMDRVPKVTPSSAI S S IARENH 
EPERLGLNGIAETTVAMEVT 



6494 



2425 



1052 



AVAGGARPCSTPSSPHRRCRRHRPRPLPRPPAAIMSASAVYVLD 
LKGKVL I CRN YRGDVDMSE VEHFMP I LM EKEEEGMLS P ILAHGG 
VRFMW I KHNNLYLVATSKKNACVSLVFS FLYKWQVFSBYFKEL 
EEES I RDNFVI I YELLDELMD FGYPQTTDS K I LQE Y I TQEGHKL 
ETGAPRPPATVTNAVSWRSEGI KYRKNEVFLDVI BSVNLLVSAN 
GNVLR SEIVGS I KMRVFLSGMPELRLGLNDKVLFDNTGRGKS KS 
VELEDVKFHQCVRLSRFENDRTI SFI PPDGEFELMS YRLNTHVK 
PLIWIESVIEKHSHSRIEYMIKAKSQFKRRSTANNVEIHIPVPN 
DADSPKFKTTVGSVKWVPENSEIVWSIKSFPGGKEYLMRAHFGL 
PSVEAEDXEGKPPISVKFEI PYFTTSGIQVRYLKI IEKSGYQAL 
PWVRY ITQNGDYQLRTQ 



6495 



2425 



1052 



AVAGGARPCSTPSS PHRRCRRHR PRPLPRPPAAIKSASAVYVLD 
LKGKVL IOINYRGDVDMSEVEHF^PILMEKEEEGKLS PILAHGG 
VRFMWI KHNNLYLVATS KKNACVSL VPS FLYKWQVFSEYFKEL 
EEESIRDNFVII YELLDELMD FGYPQTTDSKILQF^ITQEGHKL 
ETGAPRPPATVTNAVSWRSEG IKYRKNEVFLDVIES VNLLVSAN 
GNVLRSE IVGSIKMRVTL^GMPEIJUiGLNDKVLFDNTGRGKSKS 
VELEDVKFHQCVRLSRFENDRTISFIPPIXSEraLMSYRIiNTHVK 
PLI WIES V I E KHSHSR I E YMI KAKS QFKR RS TANNVE IHI P VPN 
DADS PKFKTTVGSVKWVPENSE I VWS I KS FPGGKEYLMRAHFGL 
PSVEAEDKEGKPP I SVKFEI PYFTTSGIQVRYLKI IEKSGYQAL 
P WVRYI TQNGD YQLRTQ 



6496 



247 



559 



LRAVSLLPLQLVLPE YS IHSLFCIMFLCAQEWLTIX3LNVPLLFY 
HFWRYFHCPADSSELAYDPPVVMNADTLSYCQKEAWCKLAFYLL 
SFFYYLYCMIYTLVSS 



6497 



1053 



352 



ANTQICRLCPRRHI^PPCGAKMGNGTEEDYNFVFKVVLIGESGV 
GKTNLLSRFTRNEPSHDSRTTIGVEFSTRTVML 
TAGLERYRAI TS AYYRG AVGALLVFDLTKHQT YAVVERWLKELY 
DHAEAT I WMLVGN KSDLSQAREVPTE EARMFAENNGLLFLE TS 
ALD STNVELAFETVLKE I FAKVSKQRQNSIRTNAI'TLGSAQAGQ 






WO 01/53312 



PCT/US00/34263 











BPGPGEKRACCISL 


6498 


2636 


272 



SLRLCPWGTHIiAGPTTMRI3SLl*AIiI^ 

LRVSWIQGBGEDPCVEAVGERGGPQNPDSRARLDQSDEDFKPRI 
VP YYRD PNKP YKKVLRTR Y I yrE-LG S RERLLVAVLTS RATTjST L 
AVAVNRTVAHHFPRIiLYPTGQRGARAPAGMQVVSHGDERPAWLM 
S ETLRIILHTHPG AD YDWP F I MQDDTYVQ AP RLAAIiAGHI>S INQD 
L YLGRAEE FI GAGEQAR YCHGGFG YLLSRS I*LLRLRPH1.DGCRG 
DILSARPDEWbGRCLIDSLGVGCVSQHQGQQYRSFEIiAKNRDPE 
KEGSSAFI^SAFAVHPVSEGTLMYRLHKRFSALELERAYSE I EQI* 
QAQIRNIjTA/1»TPEGFJVGLSWPVGIiPAPFTPHSRFEVIjGWDYFTE 
OHTFSCADGAPKCPLQGASRADVGDAI*EfTAI*EQtiNRRYQPRIjRF 
QKQRLLNG YRRFDPARGMEYTLDLLLE CVTQRGHRRAJjARRVSIi 
LRPI*S R VE 1 IiPMP YVTEATRVQLVI* PL»DVAE AAAAPAFLiEAFAA 
NVLEPREHAIjIjTUIjLVYGPREGGRGAPDPFIiGVKAAAAEIjERRY 
PGTRLAWIJWRAFJVPSQVRIaMDVVSIQ^ 

PE VI»NR CRMNAI SG WQAFFP VHFQ EFNPAIiS PQRS PPGP PGAGP 
DPPSPPGADPSRGAPIGGRFDRQASAEGCFYNADYLAARARLAG 
ELAGQEEEEALEGLEVMDVFLRFSGLHL.FRAVEPGLVQKFSt^RD 
CSPRLS EELYHRCRL^NLEGLGGRAQLAMALFEQEQAKST 


6499 


3 


2040 


SCSADTRPSGQAWPTVGIiRAAAGAFRTGS PIiA1/5PETPQVAC1jP ) 
GHPPVRPQVSGG?GAMPDPAAHI»PFFYGSISRAEAEEHI,ICLAGM i 
ADGLFliLRQCIiRS LGGYVI^IjVHDVRFHHFP I ERQLNGTYAIAG 
GKAHCG PAELCE FYSRDPDGLPCNLRKPCNR P SGLEPO PGVFDC 
IiRDAMVRDYVRQTWKLEGEA^EQAl I SQAPQVEKLIATTAHERM 
PWYHS SLTREEAERKLYSGAQTDGKFIjIjRPRKEQGTYALS L I yg 
KTVYHYT»ISQDKAGKYCIPEGTKFDT1iWQLVEYLKI»KADG1>IYC 
LKEACPNSS ASNASGAAAPTIjPAHPSTT*THPQRRI DTLNS DGYT 

peparitspdkprpmpmdtsvyespysdpeei^kdkklflkrdnl 
liadielgcgnfgsvrqgvyrmrkkq i dvai kvlkqgte kadte 
emmreaq i mhqldnp ytvrli gvcoaeaimlvmemaggg pl>hkf 

LVGKREE I PVSNVAEIiLHQVS MGMKYIjE EKNFVHRDLAARNVLL 
VNRHYAKISDFGLSKAIiGADDSYYTARSAGKl^PLKWYAPECINF 
RKFSSRSDVWSYGVT>1VJEALSYGQKPYKKMKGPEVMAFIEQGKR 
MECPPECPPEI*YAIiMSDCWIYKWEDRPDFLTVEQRI4RACYYSLA 
SKVEGPPGSTQKAEAACA 


6500


1773 



726 


TGPTHAS ADAWGliVRS VTEVJ CANVRGN P CAAALb LLk^yA VijUAt»iv. 
MLSESSS FUCGVMLGSIFCALITMLGH I RIGHGNRMHHHEHHHIj 
QAPNKED I LK I S EDERMELSKS FRVYC I 1 LVKP KDVSLWAAVKE 
TWTKHCDKABFFSSENVKVFES INMDTNDMWLMMRKAYKYAFDK 
YRDQYNW FFLiARPTTFAI I ENLKYFIJjKKDPSQP FYLGHTI KSG 
DLEYVGMEGGIVLSV3SMKRLNSLLNI PEKCPEQGGMIWKISED 
KQLAVCLK YAGVFAENAEDADGKDVFNTKSVGLS I KEAMTYHPN 
Q WEGCCSDMAVTFNGLT PNQMHVMMYGVYRLRAFGP Y FQ 


6501


X 


^ § \j 


LVGMSGGGTETPVGCEAAPGGGSKKRDSIiGTAGSAHLI xkdlge 
IHSRX»LDHRPVXOX3ETRYFVKEFEEKRGLREMRVLENLKNMIHE 
TNEHTXPKCRX)T^ClDSLSQ^QRI^AANDSVCRLQQREQERKKI 
HS DHLVAS BKQH^u^WDNFMKEQPNKRAEVDEEKRKAM ERLKEQ 
YAEMEKD1AKFSTF 


6502 


213 


1650 



"AGNKPDPWAGRNRTAVLPDVSVFHREDVGWl'JRSWLQQSYQAVKE 
KSSEALE FMKRDLTE FTQWQHDTACT I AATASWKE KIiATEGS 
SGATE KM KKG LSD FL»G VT SDTFAPS PDKTIDCDVITLMGTPSGT 
AEPYDGTKARIjYSLQSDPATYCWEPDGPPEIiFDAWLSQFCLEEK 
KGEISELLVGSPSIRAI»YTKMVPAAVSHSEFWHRYFYKVHQI*EQ - 

EQARRDAliKQRAEQS I SEEPGWEEEEEEJUMGI SPI S PKEAKVPV 
AKISTFPEGEPGPQS PCEENLVTS VEPPAEVTPSESSES ISLVT 
Q IANPATAPEARVLP KDIjSQKIjLEASIjE EQGLAVDVGETGPS PP 








IHSKPLTPAGHTGGPEPRPFARVETLREEAPTDLRVFEIjNSDSG 
ICSTPSlWGKKGSSTDISEDVreKDFDIJD^EEEVQKALSKVDASG 
E VSGPGGSEGS EPNGPG CESS PQPAQLS PQEGPCS CLR 


6503 


213 


1650 


AGNKPDPWAGRNRTAVLPDVSVFHREDVGVIWRSWLQQSYQAVKE 
KS S HALE FMKRDLTE FTQ WQHDTACT I AATAS W KE KLATEG S 
SGATEKMKKGLSDFLGVISDTFAPS PDKTIDCDVITLMGTPSGT 
AEPYDGTKARLYSLQSDPATYOJEPDGPPELFDAWLSQFCIjEEK 
KGEI SELLVGS PS IRALYTKMVPAAVSHSEFWHRYFYXVHQLEQ 
EQARRDALKQRAEQS I SEEPGWEEEEEELMG ISPISPKEAKVPV 
AKI STFPEGEPGPQS PCF.ENLVTSVEPPAE VTPSESSES I SLVT 
Q I ANPATAPEARVLP KDIjSQKLLEAS LEEQG LAVDVGETG PS PP 
IHS KPLTPAGHTGGPE PRP PARVETLRE EAPTDLRVFELNSDSG 
KSTPS^GKKGSSTDISEDWEKDFDIJDMTEEEVQMAI»SKVDASG 
BVSGPGGSEGSEPNGPGCESSPQPAQLSPQBGPCSCLR 


6504 


2131 


1294 


GKVCIiVAHWVCLSILS PPPAGMKTPNAQEAEGQQTRAAAGRATG 
SANMTKKKVSQKKQRGRPSSQPCRNIVGCRISHGWKEGDEPITQ 
WKGTVLDQVPIN PSL YLVKYDG IDCVYGLELHRDERVLSLKILS 
DRVASSHISDANLANTI IGKAVEHMFEGEHGSKDEWRGMVLAQA 
P I MKAWFY 1 T YE KDP VL YM YQLLDDYKEGDLR IMPESS ES PPTE 
P^PGGVVDGLIGKHVEYTKEDGSKRIGMVIHQVEAKPSVYFIKF 
DDDFHIYVYDLVKKS 


6505 


2131 


1294 


GKVCLVAH WVCLS I LS PPPAGMKTPNAQEAEGQQTRAAAGRATG 
SANMTKKKVSQKKQRGRPSSOPCRNIVGCIRISHGWKEGDEP ITQ 
WKGTVLDQVP INPSIiYLVKYDGIDCVYGbELHRDERVLSLKILS 
DRVASSHISDANLANTI IGKAVEHM FEGEHGS KDEWRGMVLAQA 
PIMKAWFYITYEKDPVLYMYQLLDD YKEGDLRI MPESSES PPTE 
REPGGVVDGI.IGKHVEYTKEDGSKRIGMVIHQVEAKPSVYF I KF 
DDDFHIYVYDLVKKS 


6506 


1 


1350 


EVS P PTS CCLTVAVADPGVS EGFRGFGAGCEMPGRGRCPDCGS T 
ELVEDSHYSQSQLVCSDCGCVVTEGVLTTTFSDEGNLREVTYSR 
S TGENEQVSRS QQRGLRRVRDLCR VLQLP PTFEDTAVAY YQQAY 
RHSGIRAARLQKKEVLVGCCVLITCRQHNWPLTMGAICTLLYAD 
LDVPS S T YMQ I VKLLGLDVPSLCLAELVKTYCS S FKLFQAS PSV 
PAKYVEDKEKMLSRTMQLVELANETWLVTGRHPLPVITAATFLA 
WQSLQPADRiSCSIARFCKLANVDLPYPASSRLQELLAVLLRMA 
EQLAWLRVLRLDKRS WKH I GDLLQHRQS LVRSAFRDGTABVET 
RBKEPPG WGQGQGEGSVGNNSLGLPQGKRPAS PALLLP PCMLKS 
PKRX CP VP P VSTVTGDENI S DS EIEQ YLRT PQEVRDF QRAQAAR 
QAATSVPNPP 


6507 


1878 


929 


R5HASRLPELPSGCLVIJ2VQELVQMSGMEATVT1PIW0NKPHGA 
ARSWRRIGTNLPLKPCARAS FETLPN I SDLCLRDVp PVPTLAD 
IAWIAADEEETYARVRSDTRPLRHTWKPS PLIVMQRNASVPNLR 
GS EERIJ^ALKKPALPALSRTTELQDELSHLRSQXAKI VAADAAS 
AS L I FJJFLS PGS SNVSS PLPCFG55 r rlo 1 iarvioUi labiliVb 
VPELPSVP-IiLCSASPECCKPEHKAACSSSEEDDCVSLSKASSFA 
DMMG I LKDFHRMKQS QDLNRS LLKE ED PAVL I S EVLRRXFAL KE 
EDISRKGN 


6508 


862 


342 


WEARKRPQRWPSERREVRVPPPHLQRGRSGLEPGTPRKMAAARP 
SliGRVLPGSSVLFLCDMQEKFPJJNIAYFPQIVSVAARMLKNTTL 
DLIJ^RGLQVHWVDACSSRSQVDRLVAIARMRQSGAFLSTSEGL 
I LQLVG DAVHPQFKE I QKLIKE PAPDSGLLGLFQGQNS LLH 


6509 


2 


1053 


FVWN PRGG R KRRRQ AAVTQ AATRAS GT P S P RDGTM TOG KLS VAN 
KAPGTEGQQQVHGEKKEAPAVPSAP PSYEEATSGEGMKAGAFPP 
APTAVPLHPSWAYVDPSSSSSYDNGFPTGDHELFTTPSWDDQKV 
RRVFVRKVYT I LLIQLLVTIJVVVALFTFC35PVKDYVQANPGWYW 
ASYAWFATYLTLACC^GPRRHFPWNLILLTVPTLSKAYIjTGML 








SSYYNTTSVlJ^IK5ITALVCIaSVTVFSFQTKFDFTSCQGVLFVL 
IiMTIiFFSGLI IiA I I»L P FQYVP WIjHAVYAALGAGVFTLFIJUliDTQ 
LLMGNRRHSLSPEEYIPGALNIYLDI I Y1FTFFLQLFGTNRE 


6510 



37 


1156 


P CALDGCP QRG AVH P LLSSAMG LLAFLKTQ FVLKLL VG FVFWS 
GLVINFVQLCTLALWPVSKQLYRRI^CRIAYSbWSQLVMLl.EWW 
SCTECTLFTDQATVERFGKBHAVI ILNHNFEIDFLCGWTMCERF 
GVLGSSKVLAKKBI/L YVPIi I GWTWYFLE IVFCKRKWEEDRDTW 
EGLRRLSDYP E YTW F1*L YCEGTRFTETKHR VSMEVAAAKGLPVI* 
i\. X n JjJjPRTKG FTTA VKCLRGTVAA VY D VTLNFRGNKNPS LiLG I L 
YGKKYEADMCnnUlFPLEDlPLDEKEAAQWLHKLYQEKDALQEI Y 
NQKGMFPGEQFKPARRPWTLLNFIjSWATI LLS PLFSFVLGVFAS 
GS PLLILTFLGFVGAGNGHCR 


6511 



2541 


1425 


GEEQPLAAAPTECLEQVIGGAGDPGTWAS FPSPLPGPAPXjKGGK 

tmatnfsdi vkqgyvkmksrklgiyrrcwlvfrksss kgpqrlb 
kypd2ks vcltrgcp kvte i snvkcvtrlp ketkrq ava 1 1 ftdd 

! SART FTCDS E1»EAEEW YKTLS VECLGSRLND I SLGE PDIjLAPGV 
QCEQTDR FNVFX.LJPCPNU3VYGECKLQI THEN I YI»WD I HNPR VK 
LVSWPLCSLRRYGRDATRFTFEAGRMCDAGEGLYTFQTQEGEQI 
YQRVHSATLA I AEQKKRVIJ^MEKirVRLLNKGTEHYS YPCTPTT 
MLPRSAYWHHITGSQNIAEASSYAGEGYGAAQASSETDIjLNRFI 
LLKPKPSQGDSS EAKTPSQ 


6512 



159 


807 


FGKKSTWFPI^RSIjRVASGRSCKLGHGGYTGSGPGFGEPRDSGA 
EVPSGSGRATGCKRGGVRGARQGRAPGSS I WRKEPRMVCTRKTK 
TLVSTCVILSGMTNI I CI*Ij YVGWVTNYI AS VYVRGQE PAPDKKL 
EEDKGDTLKI IERXtDHLENVIKQHIQEAPAKPEEAEAEPFTDSS 
LFAHWGQELSPEGRRVALKQFQYYGYNAYLSDRLPLDRP 


6513 



2 


756 


FVS PE PGFSIiAQLNIi I WQIiTDTKQLVHS FAEGQDQGSA YANRTA 
LFPDLLAQGKASI^RIjQRVRVADEGS FTCFVS I RD FGSAAVS L&V 
AAPYSKPSMTLE?NKD1*RPGDTVTITCSSYQGYPEAEVFWQDGQ 
GVPLTGNVTTSQMANEQGLFDVHS I LRWLGANGT YS CLVRN P V 
UQQDAHSS VTITPQRS PTGAVEVQ VP EDP VV7UjVGTDAT1iK LTb> r 
SPEPGFSLAQLNIil WQLTDTBCQLVHS FAEGQDQGS AYANRTAL F 
PDLLAQGNAS LPJLQRVRVADEGS FTCFVSIRDFGSAAVSLQVAA 
PYSKPSMTLEPNKDLRPGDTVTITCSSYQGYPEAEVFWQDGQGV 
PLTGNVTTiSQMANEQGIjFDVHS I LR WIiGANGT YS CLVRN PVIiQ 
QDAHSS VTITPQRS PTGAVEVQVPEDP WALVGTDATIiRCS FS P 
EPGFSLAQLNIjIWQLTDTRQLVHSFTEGR 


6514 


985 


302 



VGIPGPTISSAAEMEDLJUDLOEELRYSLATSRAKMGRRAQQESA 

KASEEI EDFRIiRPQSLNGSDYGGDI PI I PDLEEVQEEDFVMVA 
APPSIQIKRVOTYRDLDNDI^KYSAIQTLIXSEIDLKLLTKVLAP 
EHEVRERNPS WQDDVGWDWDHliFTEVSS EVLTE WDPLQTE KEDP
AGQARHT 


6515



1345 



305 


GRVGS RRRGAAV PGGCGAGSTQLE VS AS AS C5G ALG S ADMNP I W 
VHGGGAGPI S KDRXERVHQGIWRAAT VG YG ILREGGS AVDAVEG 
AWALEDDPEFNAGCGSfVIJQTNGEVEMDAS IMDGKDLSAGAVS A 
VQCIANP I KLARI>VMEKTPHCFL.TDQGAAQFAAAMGVPEI PGEK 
L VTERNKKRIjE KE KHZKGAQKTD CQ KNLGTV GAVAliD CKGNV AY 
AT3TGGI VNKMVGR VGDS PCLGAGGYADND I GAVS TTGHGES I L 
KVNIiARLTLFHIETCXTVEEAADLSIiG YMKSRVKGLGGLI WS K 
TGDWVAKWTSTSMPWAAAKIXSKLHFGIDPDDTTITDLP 


6516 



1 


1402 


FRPJLRYIX5QDATAAARDIJRTRGU2GYCPSATARQQVLVSALQQL 
KGRRSEHRNENQEMPYSTNKELIUSIMVGTAGISLLiLVTYHKVR 
KPGI AMKL PEFLS LGNTFNS I TLQDE I nDDQGTTV I FQERQI^ I 
LEKLNEIiLTNMEBLKEEIRFLIOSAIPK^ 

ISPQHRARKRRLPTIQSSATSNSSEEAESEGGYITANTBTEEQS
FPVPKAFNTRVEELNIjDVIiIA3KVDHLRMSE 

EKFRDEIEFMWRFAi^YGDMYEI-»STNTQSKKHYANIGKTXiSERA 
X7TOAPMTTGHCHLW YAVIXX5 YVSEFEGl^NKI^ IA 
I KT iT tPEE P FLYYLKGR YCYTVS KLSW I EKKMAATLFGKI PSSTV 
QEALKNFLKAEELC PG YSN PNYMYLAKCYTDLE BN QNALKF CNI» 
ALLLPTVTKEDKEAQKEMQKIMTSLKR 


6517 


3 


1414 



GRVWGGS S SLNAMVYVRGHAED Y ERW QRQGARGWDYAHCLP Y FR 
KAOGHELGASRYRGADGP LRVS RGKTNHPLHCAFI.EATQQAG YP 
LTEDMNGFQQEG FGWMDMT I HEGKRWSAACAYLHPALSRTNLKA 
EAETLVSRVLFEGTRAVGVEYVKNGQSHRAYASKEVILSGGAIN 
S PQLLMLSG IGNADDLKKLGI P WCHLPGVGQNLQDHLEI YIQQ 
ACTRPITIxH5AQKPlJlKVCIGLEVnL;WKFTGEGATAHl»ETGGFIR 
SQPGVPHPDlQraFLPSQVIDHGRVPTQQEAYQVHVGPMRGTSV 
GWLKLRSANPQDHPVI QPNYLSTETDI EDFRLCVKLTRE IFAQE 
ALAP FRGKELQPGSHI QSDKE IDAFVRAKADSAYHPSCTCKMGQ 
PSDPTAVVDPO/TRVLGVENLRVVIIAS IMPSMVSGNLNAPTIM IA 
EKAADX I KGQPALWDKDVPVYKPRTLATQR 


6518 


242 


1098 


PAWNPGSBPRTRVRPRARSFPLPPPRAPRRRRHRLLRAVPGPSR 
RHRCRRRAP P PPSTMGDAGS ERS KAPSLPPRCPOGFWGS SKTTCN 
LCS KCFADFQKKQPDDDSAPSTSNSQSDLFSEETTSDNimTS IT 
TPTI^PSQQPIjPTEIiNVTSPSKEECGPCTDTAHVSLITPTXRSC 
GTDSQSENEAS P VKRPRLXjENTERS EETSRS KQKSRRRCFQ CQT 
KIiELVCX2El^SCRCGYVFCMLHRliPEQHDCTFDHMGRGREEAIM
KMVKLDRKVGRS CQRIGEGCS 


6519 


3 


1113 


ERKMAKPPSPVHCVAAAAPTATVSEKEPFGKLQLSSRDPPGSLS
AKKVRTEEKKAPRRVNGEGGSGGNSRQLQPPAAPSPQSYGSPAS 
WSFAPLSAAPS PSSSRSS FSFSAGTAVPSSASASLSQPGPRKtiL 
VTPTI^HAQPHHLLLPAAAAAASANAKSRRPKEKREKERRRHGL 
GGAREAGGASREENGEVKPIiPRDKIKDKIKERDKEKEREKKKHK 
VMNEIKXFJ^EVKILLKSGXEKPKTNIEDLQIKKVKICKKKXKHK 
ENEKRKRPKMYSKSIQTICSGLLTDVEDQAAKG I LNDN I KD YVG 
KNLDTKN YDSKI PENSB FP FVSLKE P R VQNNL KRlrDTLE F KQL I 
HIEHQ PNGGAS VIHCLQ 


6520 


3 


1113 


ERKMAEPPSPVHCVAAAAPTATVSEKEPFGKLQLSSRDPPGSLS 
AKKVRTEEKKAPRRVNGEGG SGGNS RQLQPPAAPS PQSYGSPAS 
WSFAPLSAAPSPSSSRSSFS FS AGTAVPSSASASLS QPG PFJCLL 
VPPTLLmQPHHLIiPAAAAAASANAKSRRPKBKREKERRRHGL 
GGAREAGGASREENGEVKPL PRDKI KDKIKERDKEKEREKKKHK 
VM^IIGiCENGEVKII>LKSGKEKPKlWIEDLQIjKKVKKKKKKKHK 
ENEKRKRPKMYSKS IQTICSGIiTDVEDQAAKGIIjbNIKDYVG
KNLDTKNYDSKI PENSEFP FVSLKEPRVQNNLKRLDTLEFKQliI 
HIEHQPWGGASVIHCLQ 




T Q A 

loft 

m 


1 ""7 O Q 

1 /9o 


KLFKT^TDTSQGELVHPKALPLIVGAQLIHADK^ 
I RRTVNSTRETP P ICS KLAEGEEFKPE PD T SS EES VSTVEEOFNE 
TP PATS S EAEQPKGEPENEE KEENKS S EETKKDEKDQS KEKEKK 
VKKTIPSWATLSlASQIiARAQKQTPMASSPRPKMDAILTEAIKAC 
FQKSGASWAIRKY 1 1 HKYP S LELERRGYLLKQALKRELNRGVI 
KQVKGKGAS G S FVWQ KSRKTP Q KS RNRKNRSS AVDP E PQVKLE 
DVLPLAFTRLCEPKEASYSLIPJCWSQYYPKLRVDIRPQLI/KNA 
U2RAVERGQLEQITGKGASGTFQLKJCSGEKPIJ^SIJ*1EYAII>S 
AIAAMNE PKT CSTTAIjKK YVI^NH PGTNSNYQMHLL K^TLrQKCB 
KNGWMEQ I SGKGFSGTFQLCFP YYPS PGVLFPKKEpDDSRDEDE 
DEDESSEEDSEDEEPPPKRRLQIOCTPAKSPGKAASVKQRGS KPA 
PKVSAAQRGKARPLPKKAPP KAKTPAKKTRPSSTVIKKPSGGSS 
KKPATSARKE 


6522


1042 


391 


NKWI^SPRSHRTPESGRVLSLFRIjPPPGMALSGSTPAPCWEED
ECLD Y YGMLS LHRM FE WGGQLTECELE LLAFLLDE APGAAGGL 

SRARSGLKLLLELERRGG<n)ESNLRIjIX;QL^ 

RKRRRP VS PERYSYGTS SSS KRTEGS CRRRRQSS SS ANSC3QGS P 

PTKRQRRSRGRPSGGARRRRRGPQPHPSSSQSPPDLPLKAK 


6523 


2 


1097 


ASCQTRRRTAALDS GER IAGRRS P XAliAMASNFND I VKQG YVKI 
RSRKLGI FRRCWL VTKKASS KG PRRLEKFPDEKAAYFRNFHKVT 
ELHNIKNI TRLPRETKKHAVAI I FHDETS KT FACES ELEAEEWC 
KHLCMECLGTRLNDIS bGEPDIiIiAAGVQREQNERFNVYLMPTPN 
LD I YGECTMQ I THEN I YLWD IHNAKVKLVMWPLSS LRRYGRDST 
V7FTFESGRMCDTGEGLFTFQTREGEMI YQKVHSATLAI AEQHER 
IiMLEMEQKARLQTSIiTEPMTLSKSISIiPRSAYWHHITRQNSVGE 
I YSI^QGNHENRHSDIiTGKSCKTS ENRFLEENAPLVMYG I THHLF 
MDTSTCKWHDLE 


6524 


2 


1097 


AS CQTRRRTAAIiDSGBRIAGRRS P IALAMASNFNDI VKQG YVKI 
RSRKLGIFRRCmiVTKKASSKGPRRliEKFPDEKAAYFRNFHKVT 
ELHN I KN I TRL PRE TKKHAVAI I FHDETS KTFACES ELEAEEWC 
KRljC^EC^GTRLNDISliGEPDLlJ^GVQREQNERFTSr^m^MPTPN 
IJ5IYGECTMQITHENIYX.WDIH1JAKVKLVMWPLSSLRRYGRDST 
WFTFESGRMCIKrcEGLFTFO/rREGEMIYQKVHSATI^IAEQHER 
LMLEMEQ KARLQTS LTE PMTLS KS I S LPRS AYWHH I TRQNSVGE 
IYSLQGNHENRHSDLTGKSCKTSENRFLEENAPLVMYGITHHLF 
MDTSTCKWHDIjE 


6525 


1 



1859 


GESPFSEEESIEFNPSSSGRSARTVSSNSFCSDDTGWPSSQSVS 
PVKTPSDAGNSPIGFCPGSDEGFTRKKCTIGMVGEGSIQSSRYK 
KES KSGLVKPGSEADFS SSSSTGS I SAPEVHMSTAGSKRSSSSR 
NRGPHGRSNGASSHKPGSSPSSPREKDLLSMLCRNQLSPVNIHP 
SYAPSSPSSSNSGSYKGSDCSPIMRRSGRYMSCGENHGVRPPNP 
EQYLTPUX3KEVTVRHLKTKLKESERRLHERESEIVELKSQLAR 
MRE DW IEEE CHRVEAQ LALKEAR KE I KQL. KG; V I ETMRS S LADKD 
KGIQKYFVDINIQNKIO^ESLIjQSMEMAHSGSLRDEI.CLDFPCDS 
PEKSIiTLNPPLDTMADGI*S LiEEQVTGEGADRELLVGDS I ANSTD 
LFDEIVTATTTESGDIJS1*VHSTPGANVLELLPIVMGQEEGSVW 
ERAVQTDWP YS PA I SE I»I QSVLQKLQDPCPS S LAS PDES EPDS 
ME S FPESLS AL WDLTPRNPNSAI LLS PVETPYANVDAEVHANR 
LMRELDFAACVEERLDGVI PLARGGWRQYWS S S FLVDLLAVAA 
P W PT VL WAFS TQRGGTD P VYNI G ALLRG CCVVALHS LRR T AFR 
IKT 


6526 


2 


2034 


SGRAGEPEE WRGRQ 1 1 DS KETW I P FNS EDSQQLE EAYSSGKGCN 
GRWP TDGGRYDVHLGERNRYAVYWDELAS E VRRCTWFYKGDKD 
NKYVPYSESFSQVLEETYMLAVTLDEWKKKLES PNREII ILHNP 
KLMVHYQPVAGSDDWGSTPMEQGRPRTVKRGVENISVDIHCGEP 
U3IDHLVFVVHGIGPACDLRFRSIVQCVNDFRSVSLNIjLQTHFK 
KAQENQQ IGRVE FIiPVNWHS PLHS TGVDVDLQRI TLPS INRLRH 
FTWDT I LDVFFYNS PTYCQTIVDTVAS EMNR I YTLFLQRNPDFK 
GGVSIAGHSIiGSLILFDILTWQKDSLGDIDSEKGSLNlVMDQGD 
TPTLEEDLKKLQLSEFFDI FEKEKVDKEALALCTDRDLQEIGIP 
LGPRKKIIiNYFSTRKNSMGIKRPAPQPASGANIPKESEFCSSSN 
TRNGDYLDVGIGQVS VKYPRLI YKPE I FFAFGS P IGMFLTVRGL 
KRIDPNYRFPTCKGFFNIYHPFDPVAYRIEPMWPGVEFEPMLI 
PHHKGRKRMHLELREGLTRMSMDLKimiJjGSLRMAWKSFTRAPY 

PALQAS ETPEETEAEPES TS EKPSD VNTEETS VAVKEE VLP I NV 
GMLNGGQRIDYVLQEKPIESFNEYLFALQSHLCTWESEDTVLLV 

LKEIYQTQGIFLDQPLQ 


6527 


1 


922 


GWVPLLSRILPSDACKIYKQGINIRI^TTLIDFTDMKCQRGDLS 
FIF^GDAAPSESFVVLDNEQKVYQRIHHEESEMETEEEVDILMS 
SDIYSATLSTKS ISFTRAQTGWLFREDKTERVGNFLADFYLVNG
LVl^SPJCRREHl^El^I LJ^KAlriKS bSK^ 
SLTPPP<5NTITW2EYISAENGKAPHLGREI*VCKESKKTFKATIA 
MSQEFPLGIELIJjNVLBVVAPFTaiFNKIjP^FVQMKXjPPGFPV^ 
DI PVFPTTTATVTFQEFRYD3FDGS I FTIPDDYKEDPSRFPDL 


6528 


1 


1073 


LTGPAAAEPRCAADAGMKJiAl^RRKGVWLRIJlKILFCVI^LYlA 
I PFL I KLCPG IQAKLI FLNFVRVPYFIDL.KKPQDQGLNHTCNYY 
LQPEE DVT IGVWHTVPAVWW;<NAQG KDQMWYEDALASSHP I IL»Y 
IjHGNAGTRGGPHR VELY KVLS S LGYHVVTFDYRG WGDSVGTPSE 
RGMTYDALHVFDW I KARSGDNP VYI WGHSLGTGVATNLVRRLCE 
RETPPnALILBSPFTNIREEAKSHPFSVIYRYFPGFDWFFLDPI 
TSSG I KFANDENVKHI S CPLLI LHAEDD P WPFQLGRKLYS IAA 
PARS FRDFKVQFVPFHSDLGYRHKY I YXSPELPR ILREFLGKSE 
PEHQH 


6529 


363 


2215 


THIRYNKIGWKTMSCGNEFVETLKKIGYPKADNLNGEDFDWLF 
EGVEDES FLKWFCGNVNBQNVLS ERELEAFS I LQKSGKP I LEG A 
ALDEALKTCKTS DLKTPRLDDKELEKLEDE VQTLLKLKNLKI QR 
RNKCQIJ^ASVTSHKSLPJ^NAKEEEATKKLKQS^ILNAM 
NEIjQALTDEVTQLMMFFRHSNLGQGTNP lvflsq FSLEKYLSQE 
EQSTAALTLYTKKQFFQGIHEWESSNESQFFNFLKIQTPS I CD 
NQEI LEERRLEMARLQLAYI CAQHQLIHLKASNSSMKSS I KWAE 
ESLHSLTSKAVDKENIiDAKISSLTSElMKLEKEVTQIKDRSLPA 
WRENAQ LLNM P WKGD FD LQ IAKQD YYTARQE L VLNQL I KQKA 
SFELLQLSYE I ELRKHRD I YRQLENLVQELSQSNMML YKQLEML
TDPSVSQQINPRNTIDTKJDYSTHRLYQVLEGENKKJCELlFLTHGN 
LEEVAEKLKQNI SLVQDQLAVSAQEKS FFLSKRNKDVDMLCDTL 
YQGGNQLLIjSDOELTE^raKVESQIiNKLNHLLTDILADVKTKRK 
TLANNKLHQMERE FYVYFLKDEDYLKDI VENLETQS KI KAVS LS 
D 


6530 


1 128 


2986 


GAAHHGAIVQVHPLLPGSSTIMIHDLCLVFPAPAKAWYVSDIQ 
ELYI RWDKVE IGKTVKAYVRVLDLHXKP FLAK YFPFMDLKLRA 
ASPI ITLVALDEAIJDNYTITFLIRGVAIGQTSLTASVTNKAGQR 
I NSAPQQ I EVFP P FRLMP RKVTLL IGATMQ VTS EGG PQ FQSN IL 
FS I SNES VALVS AAGLVQGLAI GNGTVS GLVQAVDAETGKWI I 
SQDLVQVEVLLLRAVRIRAP Z MRMRTGTQMP I YVTG I TNHQN P F 
S FGNAVPGLTFHWS VTKRDVLDLRGRHH EAS I RLPSQYNFAMNV 
LGRVKGRTGLRAVVKAVDPTSGQLYGLAIiEIiSDEIQVQVFEKiQ 
I^PEIEAEQII^PNSYIKIXiTNPJXlAASLSYRVlJDGPEKVPV 
VHVDEKGFLASGSMIGTSTI EVIAQEPFGANQTI I VAVKVS PVS 
YLRVSMS PVliHTQNKEALVAVPLGMTVTFTVHFHDNSGDVFHAH 
S S VLNFATNRDDFVQ I GKG PTNNTCWRTVS VGLTLLRVWDAKH 
PGLSDFMPLPVLQAI S P ELS GAMWGD VL CLATVLTS LEGLSGT 
WSSSANS I LHIDPKTGVAVARAVGS VTVY YE VAGHLRTYKE VVV 
S VPQR I MARHLH P I QTS FQBATAS KV I VA V GDRS SNLRGECTPT 

QYFCS ITMHRLTDKQRKHLSMKKTALWSAS LSSSHFSTEQVGA 
SVPFS PG LFADQAE I LLSNHYTSSE IRVFG APEVLENLEVKSGS 
PAVLAFAKEKSFGWPS FITYTVGVLDPAAGSQGPLSTTLTFSSP 
VTNQAIAI PVTVAFVVDRRGPGPYGASLFQHFLDSYQVMFFTLF 
ALLAGTAVMIIAYHTVCTPRDLAVPAALTPRASPGHSPHYFAAS 
SPTSPNALPPARKASPPSGLWSPAYASH 


6531 


845


1425 


PSASIPPSASPDPVPDIRTCHFCLVEDPSVGCISGSEKCTISSS 
SLCMVITIYYDVKVRFJVRGCKQYISYRCQEKRNTYFAEYWYQA 
QCCQYDYCNSWSS PQLQSSLPEPHDRPLALPLSDSQI QWFYQAL 
Nl^IiPLPNFHAOTEPDGI^PMVTLSLNLGLS FAELRRMYLFLNS 
SGLLVLPQAGLLTPHPS 


6532 


2 


954 


AAGPPSEWNQDS LFPEPEPGPAPQVLLGPQGPGL I KG VAPPTL
ITDSTGTHLVTLTVTNKNAHSPGLSRGSPQQPSSQPGSPAPAPSA 
QMDLEHPIjQPIjFGTPTSIiljKKEPPGYEEAMSQQPKQQENGSSSQ 
O^DLroiLIQSGEISADFKSPPSLPGKEKPSPKTVCWSPLAAQ 
PSPSAELPQAAP P P PGS P SLPGRI*ED FIiESSTGLPLiIjTSGHDGP 
EPLSIilDDLHSQI^SSTAIIiDHPPSPMDTSEIjHFVPEPSSTMGI* 
DLADGHLD S MD WLELS SGG PVLSLAPLSTTAPSLFSTDFTjDGHD 
LQLHWDSCL 


6533 


1798 



373 



STISWliARVEPPRRSSGVGAAlUiRFPGGSRPLRARACVLAI*AVIj 
ALLERNNADSMSAHSMIiCER I AIAKEI»I KRAESLS RSRKGGI EG 
GAKLCSKLKAELKFIXJKVEAGKVAIKESHIiQSTNLTHl^RAIVES 
AENI^E^A7SVLHVFGYTDTI/5EiCQTLVVDWANGGHTWVXAIGR 
KAEALHNI WLGRGQYGDXS I 1 EQAEDFjLQASHQQPVQYSNPHI I 
FAFYNS VS S PMAE KLKEMG I S VRGDXVAVNALLDHPEEX.Q PS ES 
ESDDEG PELLQVTRVDREN I LASVAF PTE I KVDVCKR VNUD I TT 
LITYVS AI>S YGG CHF I FKE KVLTEQAEQERKEQVLPQIjE AFMKD 
KELFACESAVKDFQSJLDTLGGPGERERATVLIKRINVVPDQPS 
ERALRLVAS S KINS RSLTI FGTGOTLKAITMTANSGFVRAANNQ 
GVKFSVFIHQPRALTESKEAIATPLPKDYTTDSEH 


6534 


47 


59«J 


"katrfisaafwlnkqgvspakcphtswswslqtlsflfsgdla 
ekslqcfpcsamlleiil pllgi hfvlrtaraqsvtqpdih itvs 
egaslelrcotsygatpylfwmertveeapillvclkpwrvass 
lekkekedesfqliilgsrynvijkahciilplirwltsgdsllsao 

PHCPQGL 


6535 


250 


964 


LIKTFFRDVAIQRDLLPKEKNIjETLLTLAFLEIDKAFSSHARLS 
ADATIiLTSGTTATVALLRDG I EL WAS VGDSRAILCRKGKPMKL 
TIDHTPERKDEKERI KKCGGFVAWNSI^GQPHVNGRLAMTRS IGD 
LDLKTSGVI AE PETKRI KLHHADDS FLVLTTDG INFM VNSQE I W 
D FVNQ CHDPNEAAHAVTEQAI Q YGTEDNS TAWVP FGAWG KY KN 
SEINFSFSRSFASSGRWA 


6536 


242 


1174 


SbVKEMTNQYGILFKQEQAHDDAIWSVAWGTNKKENS ETWTGS 
LDDLVKVWKWRDERIiDLQWSLEGHQLGVVSVDISHTLPIAASSS 
LDAHIRLWDLENGKQ IKSI DAGPVDAWTIiAFSPDSQYlATGTHV 

GKVNI FGVESGKKEYS LJXTRGKF I LS I AYS PDGKYLAS GA I DG I 
INI FD IATGKLiHTLEGHAMP IRSLTPS PDSQLLVTASDDGYI K 
IYDVQHANI^TLSGHASVTVLNVAFCPDDTHFVSSSSDKSVKVW 

DVGTRTCVHTFFDHQDQVWGVKYNGNGSKI VSVGDDQE IH I YDC 
PI 


6537 


1638 


921 


NRFNPPPTQGPDPSLVYRPDVDPEVAKDKASFRNYTSGPLLDRV 
FTTYKLMHTHQ/TVI) FVRSKHAQFGGFS YKKMTVME 
DESDPDVDFPNSFHAFQTAEG IRKAHP DKDW FHkVGIjIjHDiAjrKV 
LALFGEPQWAVVGDTFPVGCRPQASVVr v,ut> I r fjujN r uiaju^ « i 
STEIX3MYQPHCGLDRVLMSWGHDGEARGGQWGGGGRWGTVGGGG 

AEAVP AGDTL.S PQSTCTR 


6538 


3345 


* 


PYT,YnPT.DAL,rTCOTAPEEAF I KIjDGIjAGMLTEQI*RRI>TKQVQE 
ARHNRDDEAI KKAVNE YDETMEK YI PVLMAQAKI YWNI»ENYPMV 
EKI FRKSVEFCNDHDWKLNVAHVLFMQENKYKEAIGFYEPIVK 
KHYDNTLNVSAIVLANLCVSYIMTSQNEKAEEIiMRKIEKEEEQI* 
SYDDPNRKMYHLCIVNLVIGTLYCAKGNYEFGISRVI KSLEPYN 
KKIXTTDTWYYAKRCFLSIiLE2WSKHMIVIHDSVIQECVQFiySHC 
ELYGTW IPAVI EQPLEEERMHVGIG^TVTDESRQIjKAI* I YE I IGW 

NK 


6539 


218 


339 


"ficaasphphfssi^hpdqpeftpvqdelfju«elwgpgv 


6540 


3 


391 


LERLWT J J iT iRRP EDAMAEC PTLGEAVTUHPDRIjW AWEKFVYXiDE i 
KQHAWLPIjT I E I KDRI^I^VLIJRR ED VVIX3RPMTPTQ I G P S LL> P 
IMWQI,YPDGRYRSSDSSFWRX,VYH I KIDGVEDMLLELLPDD


6541 


1165 


536


RTLVQRRILMLJjRKPARGRDLRGRGRGTPRGGRKGtiL.Fl'PDEFP 
RFEGGRKPDSWDGNREPGPGHEHFRDTPRPDHPPHDGHSPASRB 
RSSS LQGMBMASLP PRKRPWHDGPGTSEHREMEAPGGPSEDRGG 
KGRGGPGPAQRVPKSGRSSSLDGEHHDGYHRDEPFGGPPGSGTP 
SRGGRSGSNWGRGSNMNSGPPRRGASRGGGRGR 


6542 


3 


3775


SWPRGRGETGGHPGAIiRTRTMQKSVRYNEGHAliYIiAFIiARKEGT 
KRGFI*S KXTAEAS R WHEKW PAL YQNVL FY FEGEQS CR P AGM YKL 
EG CS CERT PAP PRAGAGQGGVRDALDKQ YY FTVL.FGHEGQKPLE 
IjRCEEFrQDGKE^MEAIHQASYADILIEREVLMQKYIHLVQIVET 
EK I AANQIiRHQLEDQDTE I ERLKS2 1 IALN KTKERMRP YQSNQ E 
DEDPDI KK I KKVQSFMRGWLCRRKWKTIVQDYI CS PHAESMRKR 
NQIVFTMVEAESEYVHQLYILVNGFLRPLRMAASSKKPPISHDD 
VSSI FUN SET XMFLHE I FHOGLKAR I ANW PT1» I LADL FDILLPM 
LNI YQEFVRNHQYS LQVLANCKQNRDFDKLI*KQYEANPACEGRM 
LET FTjTYPMFQ I PRY I ITLHELLMTPHEHVBRKSLEFAKSK1.E 
ELSRVMHDEVSDTENIRKNLAIERMIVEGCD I ULDTSQTFIRQG 
SLI QVPSVERGKLS KVRLGSLSLKKEGERQCFL.FTKH FLICTRS 
SGGKIJ1LI*KTGGVLSI*IDCTLIEEPDASDDDSKGSGQVFGHLDF 
KJVVEPPDRAAFTVVLLAPSRQEKAAW^DISO^/DNIRCNG^ 
TIVFEENSKVTVPHMIKSDARI^KDIXrDICFSKTLNSCKVPQlR 
YAS VERLLERliTDLRFIiS IDFLNTFLHTYRI FTTAAWTjG KLS D 

X X Ivxv tr L X »>d _L XT y p. ITT 1 f i?r ' 1 F r AX oyiWAubuu viAJiwrivuv> uor 

PPLAVSRTSS PVRARKI^LTSPIiNSKIGALDLTTSSS PTTTTQS 
PAASPPPHTGQIPIJ3LSRGLSSPEQSPGTVEENVDNPRVDLCNK 
LKRSIQKAVLESAPADRAGVESSPAAiyrTELSPCRSPSTPRHIuR 
YRQPGGQTADNAHCS VSPASAFAIATAAAGHGS P PGFNNTERTC 
DKEFI IPJ*TATNRVLNVLRHWVSKHAQDFEL^ 
EVLRDPDI^PQERKAAANI LMALSQDDQDD I HliKLED 1 1 QMTDC 
MKAE CFES LSAMELAEQ I TLLDHV I FRS IPYEE FIjGQGWMKIiDK 
NERTP YI MKTSQHF1TOMSNL\^QIM1^ADVSSRANAIEKWVAV 
ADI CRCLHNYNGVLE I TS ALNRSAI YRLKXTWAKVS KQTKALMD 
KIjQKTVSSEGRFKI^RETLKNCNPPAVPYLGMYLTDLAFI EEGT 
PNFTEEGLVNFSKMRMISHIIREIRQFQQTSYRIDHQPKVAQYL 
LDKDLUDEDTLYELSLKIEPRLPA J 


6543 


1857 


950 



FVSGCGRAG I GL.S WAMAAE AR VSR WY FGGLAS CGAACCTH PIiDL* 
LKVffl^TQQEVKLRMTGMALRVVRTDGI LALYSGLSASLCRQMT 
YSLTRFAI YETVn^RVAKGSQGPLPFB^KVLLGSVSGLAGGFVG 
TPADlVirNrVRMQNDVKIjPQGQR 

S GATMAS S RGAliVTVGQIiS CYDQAKQLVIiSTGYLSDNI FTHFVA 
SFIAGGCATFLCQPIiDVLKTRIjMNSKGEYQGVFHCAVETAKLGP 
LAFYKGLVPAG I RLI PHTVLTFVFLEQLRKNFG I KVPS 


6544 


630 


79 


PSPCF I RSRLDGQPWMAGLEAWLSQNFSLHQPQSR VR VRRAS I S 
E PSDTD PEPRTItNPS P AGWFVQQHPBIjE LMSS FRERFGRNWLQ Y 
RSKiEPSGNPLPATPTTSAPSAPPASSQGPDTAPRPSPPQEEAR 
GPQESPQKMSEETVRAEPOEEEEEKEGKEEKEEGEMAPLPEAHl.G

EGKQKECP 


6545 


176 


560 


P PHSHAALLP AAMTPIjI/TL I LWLMGLPLAQALJDCHV CAYNGDN 

CF'NPMRCPAMVAYCMTTRTYYTPTRMKVSKSCVP 

S KHAS TTS CCQ YDLCNGTGLAT PATLALAP I LLATL WGLL 


6546 


1657 


364 


KliLNGIjDEVAAFFA^LGAIVRKHFCFLKCI^PRVRPFYAVKCWS 
SPGVLKVlAQIoGLGFSCANKAEMEbVQHIGIPASKI I CANPCKQ 
IAQIKYAAKHGIQLI^FDNEMELAKVVKSHPSAKM^ 
HSLSCLSLKFGVSLKSCRHlxLENAKKHK^EVVGVSF^ 
PQAYAQS IADARLVFEMGTEl^HKMHVLrJtXSGGFPGTEGAKVR F 
EEIASVINSALDLYFPEGCGVDIFAELGRY YVTSAFTVAVS I IA 
KKEVLLDQ PGREEENGS TS KT I VYHLDEGVYG I FNS VLFDSTI CP
TP I I^KKPST EQPLYSS SLWGPAVDGCDCVAEGIjWIi PQ LHVG D W 
LVFDNMG AYTVGMGS P FWGTQACH I TYAMS RVAWEALRRQLMAA 
EQEDDVEGVCKPLSCGWEI TDTLCVGP VFTPAS IM 


6547 



1 


541 


IJISKYI^PALCSQPGMMRCCRRRCCCRQPPHALRPLLLLiPbVIjL 

PPIiAAAAAGPNRCDTIYQGFAECLIRIiGDSMGRGGELETICRSW 

NDFHACASQVT^GCPEELAAAWESLQQEARQAPRPNNL^ 

P VHVRERGTGS ETNQETIiRATAPAIjPMAPAP PLIAAAIAIAYI^ 

RPLA 


6548 


2 


219 


FVS RLS VRDVRFPTFLGGHGADAMHTDPD YS AAYVP I ETDAEDG 
I KGCGI TFTLiGKG TEVGEXJCI LSRFQNA 


6549


73 


1490 


ETGRVCEDARPACGSRS RRRRKEAAPGI PTPS PSSSSPTSSRPA 
ARAFSKAPARI^RPRAREEPPDPGRRYIQEEI IQARKHKLIKMC 
SSVAAKLWFIiTDRRIREDYPOKEILRALKAKCCEEELDFRAWM 
DEWTjTI EQGNLGLRINGEL I TAYPQWWRVPTPWVQSDSDIT 
VIjRHI*EKMGCRLMNR PQAI J^CVNKFWTFQELAGHGVPIj PDTFS 
YGGHZNFAKMIDEAEVLEFPMWKNTRGHRGKAVFLARDKHHIA 
DI^HLIRIiEAPYLFQKYVKESHGRDVRVIVVGGRVVGTMLRCST 
JXSRMQSNGSIXSGVGMMCSLSEQGKQIAIQVSNILGMDVCGIDIiL 
MKDDGSFCVCEANANVGFIAFDKACNLDVAGI IADYAASLLPSG 
RLTRRMSLLS WSTASETS EPELGP PASTAVDNMS ASS S SVDS D 
PBSTERELLTKLPGGLFNMNQLLANE I KLLVD 


6550 


2293 


922 


FRVSRDGAPDCX3IEQMGIAMEHGGSYARAGGSSRGCWYY1jRYFF 
LFVS L I QFL 1 1 LGLVLFM\TYGNVHVS TESNLQATERRAEGLYSQ 
LLGLTASQSNLTKELNFTTRAKDAIMQMWLNARRDLDR inas fr 
QCQGDRVI YTNNQR YMAAI I LSE KQCRDQFKDMNKS CDALL FPU* 
NQKVKTLEVE I AKEKT I CTKDKES VLLNKRVAEEQL. VECVKTRE 
I^HQERQrAKEQLQKVOALCL.PLDKDKFEMDL.RNLWRDS 1 1 PRS 
LDNI^YNLYHPLGSEiiASIRFsACDHMPSU^SKVEEIARSLRAD 
IER VARENSDIiQRQKIiEAQQGLiRAS QEAKQKVE K BAQAREAKLQ 
AECSRQTQIiA^EEKAVLRKERDNIiAKELEEKKREAEQLRMElAI 
RNSALDTCIKTKSQPMMPVSRPMGPVPNPQPIDPASLEEFKRKI 
LESQRP PAGI PVAPSSG 


6551 


157 


748 


IQPPDPRNMTIAAYKE KMKELPLVSbFCSCFLADPLNKS S YKYE 
ADTVDIiNWCVI SDMEVIELNKCTSGQS FEVILKPPS FDGVPEFN 
ASIfPRRRDPSLEE IQKKLEAAEERRKYOEAEIiLKHIiAEKREHER 
EVIQKAIEENNNFIKMAKEKIAQKMESNKF^EAHLAAMLERLQ 

EKDKHAEEVRKNKELKEEASR 


6552 


157 


748 


IQPPDPRNMTLAAYKEKMKELPLVSLFCSCFLADPLNKSSYKYE 
ADTVDLNWCVI SDMEVIBLNKCTSGQSFEVILKP PSFDGVPEFN 
ASLPRRJIDPSLEEIQKKLEAAEERRKYQEAELLKHLAEKREHER 
EVTQKAIEENNNFI KMAKEKIAQK^lESRKENREAHLAA^^UJERIlQ 
EKDKHAEEVRKNKELKEEASR


6553 



2 


1807 


PVWSKMAAHLSYGRVNLNVIjREAVKREIaREFLDKCAG 
E YIjTG P FGLIAQ YS IjLKEHEVE KMFTLKGNRLPAADVKNI I FFV 
RPRLELMDI IAENVLSEDRRGPTRDFHILFVPRRSLLCEQRLKD 
IASVLGSFIHREEYSU3LIPF13GDIiIjSMESEGAFKECYX,EGDQTS 
LYHAAKGLMTLQALYGTIPQI FGKGE CARQVANMM IRMKREFTG 
SQNS I FPVFDNLLIiLDRNVDLLTPLATQLTYEGIjIDEI YGIQNS 
YVKLPPEKFAPKKQGDGGKDLPTEAKKLQIjNSAEEIjYAEIRDKN 
FNAVGS VLSKKAKI I SAAFEERHNAKTVGE I KQFVSQLPHMQAA 
RGSIANHTSIAELIKI)VTTSEDFFDKLTVEQEFMSGIDTDKVNN 
YIEDCIAQKHSLIKVI^LVCI^SVCNSGIJCQKVI^YYTa^ILOT 
YGYEHI LTLHNLE KAGLLKPQTGGRNNYPT IRKTI»RI,WMDDVNE 
QNPTDI S YVYSGYAPLS VRLAQIjI*SR PGWRS I EEVLR I I*PGPHF 
EERQPIi PTGLQKKRQ PGENRVTL I FFI/GGWFAEXAAIjRFLSQI* 
EDGGTE YVIATTKLMNGTSWI EALMEKPF


6554 


119 


1244 


FEMGSQVS VESGALHVVI VGGGFGG IAAASQLQAJuNV P FMLVDM 
KD S FHHNVAALRAS VETG FAKKT F I S YSVT FKDNFRQG L WG I D 
LKNQMVIjLQGGEALPFSHLILATGSTGPFPGKFNEVSSQQAAIQ 
AYEDMVRQVQRSRFI VVVGGGSAGVEMAAE I KTEYPBKEVTLIH 

EYREYTKVQTDKGTE VATNLVTIiCTG I KINSSAYRKAFE S R LAS 

SGAIJIVNEIILQVEGHSNVYAIGDCADVRTPKK^^ 

AN I VNS VKQR PIiQAYKPGALTFIiLS MGRNDGVGQ I SGFYVGRLM 

VRLTKSRDLFVSTSWKTMRQSPP 


6555 


1552 


498 


IHMALiLtRKINQVLLFLLI VTLC V I L YKKVHKGT VPKNDADDES E 
TPEBLEEE I PWICAAAGRMGATMAAINSIYSNTDANILFYWG 
LRNTLTRIRXW I EHS KLRE INFKIVE FNPMGL.KG KIRPDS S RPE 
LLQPLNFVRFYLPLLIHQHEKVIYLDDDVIVQGDIQEIjY1>TTLA 
LGHAAAFS DDCDLPSAQD INRLVGLQNTYMGYLD YRKKAI KDLG 
ISPoTCSrNPGVIVAI*MTEWKHQRITKQIjE 

SLGGGVATSPMLIVFHGKYSTINPLWHIRHLGWNPDARYSEHFL 
QEAKLLHWNGRHKPWDFPSVHNDLWESWFVPDPAGIFKLNHHS 


6556


241 


1449 


ASLCKGCFFVTHVliVIILPSLQSPPTFGFI.LDIDGVLVRGHRVI 
PAALKAFPJ^LWSCGQLRVPVVFVTNAGNILQHS KAQELSALLG 
CFAT3ADQVIIjSHSPMKLFSEY>IEKRMLVSGO^PVMENAQGIjGFR 
NWIVDELRMAFPLLDMVDIiERRLKTTPIjPRNDFPR 

epvrwots^limdvllsngspgaglatppyphlpvlasn^ 
wmaeakmprfghgtfllcletiyqkvtgk^ ilt 

X O XAfiDLX KKUAERRGWAA1' I R Kjj YA VCalJNPMSDVYGAN JjFHQ Y 

lqkathdgapexgaggtro^^psasqscisilvctgvynprnpq 
stepvlgggepp fhghrdlcfs pglmeashwndvneavqlvfr 

KEGWALE 


6557 


2598 


1534 


RMCGRTSCHLPRDVLTRACAYQDRRGQQRLPEWRDPDKYCPSYN 
KSPQSNSPVLLSRIiHFEKDADSSERI IAPMRWGLVPSWFKESDP 
SKI£FNTTNCR£DTVMEKRSFKVPIX3KGRRCVV^ 
QGTNQRQPYFI YFPQI KTEKSGSIGAADS PENWEKVWDNWRLLT 
MAGIFDCWEPPEGGDVLYSYTI ITVDSCKGLSDIHHRMPAII*DG 

PECLAPVBLVVKKEliRASGSSQRMIiQWlATKSPKKEDSKTPQKE 
BSDVPQWSSQFLQKSPLPTKRGTAGLLEQWLiCREKEEEPVAKRP 
YSO 


6558 


21 


1138 


FHGRRRGGRKMEIiGSCLEGGREAAEEEGEPEVKKRRJjliCVEFAS 
VASCDAAVAQC^I^NDWEMERAIiNSYFEPPVEESALERRPETI 
SEPiCTYVDLTNEETTDSTTSKISPSEiyrQQEircSMFSI*ITWNID 
GLDLNNLSERARGVCSYLALYS PDVIFLQEVIPPYYS YLKKRSS 
NYEIITGHEEGYFTAIMIJaCSRVKIiKSQEI I PFPSTKMMRNIiI»C 
VHVNVSGNELCIiMTSHLESTRGHAAEllMNQLKM\^KKMQEAPES 
ATVIFAGDTNLRDREVTR(^GLPNNIVT)VWEFIiGKPKHCQYTWD 
TQMNSNLG I TAACKLR FDR I FFRAAAE EGH 1 1 PR SLDLLGLEKL 
DCGRFPSDHWGLLCNLDIIIi 


6559 


3 


364 


GPELSGLPTRPKKLKANO/TPIAMDCCASRSC^VPTGPATTICSS 
DKSCKCGVCLPSTCPHTVKLXEPTCCDNCPPPCHIPQPCVPTCF 
LLNSCQPTPGLETLNI.TTFTQPCCEPCXPRGC 


6560 


3 


1435


TATSGGIWLRRKWRC^WPRPLPQSCVGTEGGLQVTU)TSSRIAKG 
GVDHTKMSLHGASGGHERSRDRRRSS DRSRDSSHERTES QLTP C 
IRNVTS PTRQHHVEREKDHSS SR PSS PRPQKAS PNGSIS SAGNS 
S RNSS QS S S DGS CKTAGEMVFVYENAKEGARNIRTS ERVTLI VD 
NTRFWDPS I FTAQPimilXSRMFGSGREHNFTRPNEKGEYEVAE 
GI GSTVPRAI LDYYKTG I IRCPDG I S I PELREACDYLC I S FEYS 
TIKCRDLSAIiMHEI^NDGARRQFEFYIiEEMILPLMVASAQSGER 
ECHIWI.TDDDWDWDEEYPPQMGEE YSQ1 1 YSTKLYRFFKY IE
NRDVAKSVLKERG LKK I RLG I EG YPTYKEKVKKR PG GRPBV I YN 
YVQRPFIRMSWEKEEGKSRHVDFQCVKS KSITNLAAAAADI PQD 
QLWMHPTPQVDETLDILPIHPPSGNSDLDPDAQNPML 


6561 


3 


1086 



PGRRFRRKESSS5RWFPADCLLGLRGPASSU-SPEPSPSWPSHS 
PCPMAALTDLSFM YRW FKNCNLVGNLS E KYVFITG CD SG FGNl*t» 
AKQLVDRGMQVTiAACFTEEGSQKMRDTSYRLG/TTLLDVTKSES 
I KAAAQV7VRDKVGEQGLWALVmAGVGLPSGPNEV7I,TKX)DFVKV 
INVl^VGLI 2 VTLHMbPMVKRARGRVVNMS S SGGRVAVIGGGY C 
VSKFGVEAFSDSIRRELYYFGVKVCIIEPGNYRTA1IX3KENLES 
RMRKLWERIjPQETRDS YGEDYFRI YTDKIiKNTMQVAE PRVRDVI 
NSMEHAI VSRS PRI RYNPGUDAKLLYI PIAKLPT PVTDFXLSRY 
liFRPADbV 


6562 


1 


1562 



MS TL YD I RAHKAQ LiLR F FAS S DSNKALEQRRTLHTP KL.EHLDRV 
L Y EWFLG KRS EG VP VS G PML I EKAKDFYE QHQLTE P CV FSGGVTL 
WR FKARHG I KKLDAS S E KQSADHQ AAEQ FCAFFRS LAAE HGLS A 
EQVYNADETGLFWRCLPNPTPEGGAVPGPKCGKDRLTVLMCANA 
TG S HRJLKPIAI GKCS GPRAFKG I QHDPVAYKAQGNAW VDKE I FS 
DWFHHI FVPSVREHFRTIGLPEDS KAVI*L1»DS SRAHPQEAELVS 
SNVFTI FL P AS VAS LVQPMEQG IRRDFMRNF INPP VPLQGPHAR 
YNMNDAI FS VACAWNAVP SHVFRRAWR KLW P S VAF AEG S S SEE E 
LEAECFP VKPHNKS FAH I LELVKEG SSCPGQLRQRQAAS WGVAG 
REAEGGRPPAATS PAEWWSSEKTPKADQDGRGDPGEGEEVAWE 
QAA V AFDAVTiRFAERQ P C FS AQ EVG QLRAIjRAVFRS QQQVRRRR 
GAIiGAWKVEAlfQEGPGGCGATAQS PLPCSSTAGDN 


6563 


1319 


2694 


IiARPAQP VLIiRE PEGAG P P VFAGHLVHHLQGGHIiRERAHP DLEA 
HEHPI»PCDQMFWRQMGGH1»RMVEANSRGVVWG I G YDHTAWVYTG 
GYGGGCFQGLASSTSNI YTQSDVKCVHI YENQRWNPVTGYTSRG 
LPTDRYIWSDASGLQECTKAGTKPPSLQWAV7VSDWFVDFSVPGG 
TDQEGWQYASDFPAS YHG S KTMKD FVRRRCWARKC3CLVTSGPWI* 
EVP P IALRDVS 1 1 PESPGAEGSGHS I ALWAVS DKGDVLiCRLGVS 
E L»N PAGSS WliHVG JL JJQP FAS X S I GACYQVWAVAKIJG SAFlRGSV 
YPSQPAGDCWYHI PSPPRQRLKQVSAGQTSVYALDENGNL.WYRQ 
G I TPS YPQGSS WZHVSNNVCR VS VGPLDQ VWVI ANK VQGSHSLS 
RGTVCHRTGVQPHEPKGHGWDYGIGGGWDH I SVRANATRAPRSS 


6564 


1 


975 


APGS CALWSYCGRG WSRAMRGCQIJC*GIJ^SWPGDLLSARlal*SQE 
KRAAETHFGFETVSEEEKGG KVYQ VFESVAKKYDVMNDMMS LG I 
HRVWKDLliWKhfflPbPGTQIJ^VAGGTGDIAFRF'LtmrQSQHQR 
KQKRQLRAQONI»S WEE IAKE YQNEEDSLiGGS RWVCD INKEMLK 
VGKQKAIAQGYRAGLAWVLGDAEELPFDDDKFDI YTIAFG I RNV 
TH I DQALQEAHR VI*KPGGRFLCIjE FSQVNNPL I S RLYDLYS FQ V 
I FVLGEVIAGDWKS YQYLVES IRRFPSQEEFKDMI EDAGFHKVT 
YESLTSGIVAIHSGFiO* 


6565 


1464 


999 


RSAVANGLTKRRMGLKLNGR YISI.I LAVQIAYLVQAVRAAGKCD 
AVFKGFSDCLLKUSDSMANYPOGLDDKTNIKTVCTYWEDFHSCT 
VTALTDCOEGAKDMWDKLRKE S KNLN I QGSLFELCGSGNGAAGS 
LLPAFPVI.LVSLS AAIATWLS F 


6566 


3 



1385 


KYESAQPGGTQPEPGI^ARMAIHKAI>VriCLGLPLFL>FPGAWAQG 
HVP PG CS QG LN P I* YYNLCDRS GAWG I VLEAVAGAG I VTTFVLT I 
I LVASLP FVQDT KKRSLI^TQVFTIJ^TLGI>FCIiVFACVE KPDF 
STCASRRFLFGVLFAICFSC^AAHVFALNFLARKNHGPRGWVTF 
TVAIiLLTLVEVI IWTEWIiI IT1»VRGSGEGGPQGNS SAGWAVAS P 
CAIANMDFVMAL I YVMr*LIiIK3AFDGAWPALCGRYKRWRKHG VFV 
LLTEATS VAI WWW I VM YTYGNKQHNS PTWDDPTliAI ALAANAW 
AFVLFYVI PE VSQ VTKS SPEQS YOGDMYPTRGVG YET I LKE QKG 
QSMFVENKAFSMDEPVAAKRPVS P YSGYNGQLXTS VYQPTBMAL
MHKVPSEGAYDI ILPRATANSQVMGSANSTLRAEDMYSAQSHQA 
AT PP KDGKNSQVFRN P YVWD 


6567 


125 


863 


TKRSNLKAYACS IHHIRTMS YVFVNDSSOTNVPLLQACIDGDFN 
YSKRLLESGFDPNIRDSRGRTGMIJUUu^GirVTJlCQULHKFGAD 
LLATD YQGNTALHLCGHVDT I Q PL VS NGLK I D I CNHQGATPLVL 
AKRRGVNKDVIRI^SLEEQBVKGFNRGTllSKIiETMOTAESESA 
ME SHS LLiN PNLQQGEG VLSS FRTTWQS FVEDLG FWRVL LL I FV I 
ALLSLGIAYYVSGVLPFVENQPELVH 


6568


3 


1133 


HASDRLLVLPDNYSHFSQASANLTOPSRTTELFHPTLASISSPM 
LEGAELYPNVDHGYliEGLVRGCKASLIjTG^DYINLVQCETLEDL 
KIHI^TTDYGNFIANHTNPLTVSKIDTE^KRLCGEraYFRN^ 
I^PLSTFLTYMTCSYWIDNVIIJJ^GA^ 

RFTEMEAVNIAET^SDLFNAIIjIETPLAPPPQDCMSEKALDELN 
IELLRNKLYKS YLEAFYTCFCKNHGDVTAEVMC P ILEFEADRRAF
1 1 TLNS FGTELS KEDRETL Y PTFG KLY PEGLRLLAQAEDFDQMK 
NVADHYGVYKPLFEAVGGSGGKTLE DVFYEREVQMNVLAFNRQF 
HYGVFYAYVKLKEQEIRNI VWIAECI SQRHRTKIN3YI P I L 


6569 


205 


1532 


RRORGPQRLGHGRPTPLLCRWRTAGPSHWEKQARAFQGLRPVDPR 
RMS WL FPLTKSAS S S AAGS PGGLTSLQQQKQRLIESLRNSHSSI 
AE1QKDVEYRLPFTINNLTININILLPPQFPQEKPVISVYPPIR 
HHLMDKQGVYVTS PLVNNFTMHSDLGKI IQSLLDEFWKNPP VLA 
PTSTAFPYLYSNPSGMSPYASQGFPFLPPYPPQEANRSITSLSV 

ADTVS S STTS HTTAKPAAP S ?G VLS ML. P LPI PTVDAS IPTS QNG 
FGYKMPDVPDAFPEl*SEIiSVSQL>TDMNEQEEVIiuEQFLTLPQLK 
QIITDKDDLVKS IEELARKNLLLEPSLFJUCRQTVU3KYELLTQM 
KSTFEKKMQRQHELSESCSASAIiQARLKVAAHEAEEESDNIAED 
PLEGKMEIDDFLSSFMEKRTICHCRRAKEEKLQQAIAMHSQFHA 

PL 


6570 


330 


1304 


ARLPRLTFLREGFLYVLLSHWVFVGAPRPPASDSWKKGIjVPSAP 
PASRKMG S KALPAP I PLH PS LQI/TNYS FLQAVNTFPATVDHLQG 
L YGLS AVQT^IHMtlHWTljG YPNVHE ITRST I TEMAAAQGLVDAR? 
PFPALPFTTHLFHPKG^jAIAHVLPALH KDRPRFDFANLAVAATQ 
EDP P KMGDLSKLS PGLGS P I SGLS KLT PDRKPSRGRLPSKTKKE 
FI CKFCGRHFTKS YNLLIHERTHTDERPYTCDI CHKAFRRQDHL 
RDHRYIK^KEKPFXCQECGKGFCQSRTLAVHKTLHMQTSSPTAA 

SS AAKCSGETVI CGGT 


6571 


169 


656 


APDMNRKKLQKLTDTLTKKCKHLFRG FDKDNDGCVNVLEWIHGL 
SLFTjRGSLEEKMKYCFEVFDLNGDGF I SKEEMFHMLKNSLLKQP 
S E EDPDEG I KDLVE I TLKKMDHDHDG KLS FADYELA VREETLLL 
EAFGP CL P DPKSQME FEAQV F1CDPNE FNBM ] 


6572 


49 



1646 


TPERAQPG ALLGAAG CCVCGGRW W VKb HJSKLJ X *'£>5/UU^toKKkW 
I^CSERHQKLVDBNYCiaCLHVQALKlITOSQIRNQMVQNENDNRV 
QRKQFUUsI^NEQFEX^MEEAIQKAEENKRLKEI^IiKQEEKI^ 
c»t n vt virpcT imT?vrvTT?r>TiVRPN^TEIjRELEKKLKAAYMNKERAA 
Q I AEKDAI KYEQM KRDAE I AKTMME EHKR 1 1 KEENAAEDKRNKA 
KAQYYLDLEKQLEEQEKKKQEAYEQLLKEKLMIDEIVRKIYEED
QLEKQQKLEKMNAMRRYI EEFQKEQALWRKKKREEMEEENRKII 
E FANMQQQREEDRMAKVQENE EKRLQLQNALTQKLE EMLRQRED 
LEQVRQELYQEEQAE I YKSKLKEEAEKKLRXQKEMKQDFEEQMA 
LKELVLQAAKEEEENFRKTMLAKFAEDDR I E LMNAQ KQRMKQLE 
HRRAVEKLIEERRQQFLADKQRELEEWQLQORROGFINAI IEEE 
RLKLIJCEHATNLLGYLPKGVFKKEDDIDLLGEEFRKVYQQRSEI 

CEEK 


6573 


767 


275 


GGGGGESQS FRAOJDGTRTPATDCLMYLQGPRKIJ4TOGGYDMVQK 
LFLDFFRRRLSQRPTAEELEQRNILKPR^JEQEEQEEKREIKRRL 
TRKLSQRPTVEELRERKILIRFSDYVEVADAQDYDRRADKPWTR 
LTAADKVSRGECWRVGGRTVCWVSLiGSPLGSV 


6574 


204 


1159 


LESSVPVSVGVFWACX5VSWTGAAGL<}DGALSDTMARNAEKAMTA 
LARFRQAQIxEEGKVKERRPFIiASECTELPKAEKWRRQ I IGEIS K 
KVAQIQNAGIJ3EFT?IRDUaDEINKLLREKGHWEVRI KELGGPDY 
GKVGPKMLDHEGKEVTGNRGYKYFX3AAKDLPGVRELFEKEPLPP 
PRKTPJ\ELMKAIDFEYYGYIJ3EDDGVIVPLEQEYEKKLRAELVE 
KWKABREARLARGEKEEEEEEEEE IN I YAVTEEESDEEGSQEKG 
GDDSQQKF IAHVPVPSQQEIEEAIjVRRKKMELLQKYASETLQAQ 
SEEARRLLGY 


6575 


117 


820 


S PALASQS GG ITEEKMLE PQENG VI DLPD YEHVEDET F P P FPP P 
AS PERQDGEGTEPDEESGNGAPVPVPPKRTVKRNI PKLDAQRL I 
S ERGLPALRHVFDKAKFXGKGHEAEDLKMLI RHMEHWAHRLFPK 
I^FEDFIDRVEYIjGSKKEVQTGLKKIRIjDLPIIiHEDFVSNNDEV. 
AENNEHDVTSTELDPF1*TNIjSESEMFASEIjSISL»TEEQQQRIER 
NKQLALERRQAKLP 


6576 


1 


1060 


PEPQALVGQKRGALRLLVARI.VLTVSAPAEVRRRVLRPVL.SWMD 
RCTRALADSHFRGI^VDVPGVGQAPGRVAFVSEPGAFSYADFVR 
GFLLPNLPCVFSSAFTQGWGSRRRWVTPAGRPDFDHLLRTYGDV 
WPVANCGVQEYNSNPKEHMTLRDYITYWKE YIQAGYSS PRGCL 
YL KDWHLCR D FP VED VFTLP VYFS S DVfLNE F W DALD VDD YR FVY 
AGPAGSWSPFHADIFRS FS WS VNVCGRKKWLLi FP PGQEEALRDR 
HGNLPYDVTS PALCDTHLHPRNQLAGPPLEITQEAGEMVFVPSG 
WHHQVHNLVMCCFSCPLSGAFLQEDGSTTSPLSQPEXGWNGVAH 
G 


6577 


2271 


987


SDRMASDDFDIVIEAKLEAPYKKEEDEQQRKEVKKDYPSNTTSS 
TSNSGNETSGSSTIGBTSNRSRDRDRYRRRNSRSRSPGRQCRHR 
SRSWDRRHGSESRSRDHRREDRVHYRSPPLATGYRYGHS KS PHF 
REKS PVRE PVDNLSPEERDARTVFCMQLAAR I RPRDU5DFFS AV 
GKVRDVRI ISDRNSRRS KGIAYVEFCEIQSVPLAIGLTGQRLLG 
VPI I VC^OJ^EKNRIAAMAimLQKGNGGPMRLWGSLHFNITED 
MLRGI FEPFGKI DNrVLMKDSDTGRS KGYGFITFSDSECARRAL 
EQLNG FELiAGRPMRVGHVTERLDGGTDI TFPDGDQELDLGS AGG 
RFQLMAKLAEGAG I QLPS TAAAAAAAAAAQAAALQLNGAVPLGA 
LNPAALTALSPALNLASQCLQLSSLFTPQTM 


6578 


377 


1489 


PSSSATMNRAPLKRATI LHMALTGAS DPS AEAEANGEKP FLLRA 
LQ 1ALWS L YW VTS I SMVFLN KYLLDS PSLRLDTPI FVTFYQCL 
VTTLLCKGLSALAACCPGAVDFPSLRIjDLRVARSVLPUSVVFIG 
MITFNNLCLKYVGVAPYNVGRSLTTVF^LLSYLLI^G/TTS 
LLTCGII IGGFV7I^VDQEGAEGTLSWI^TVFGVLASLCVSLNAI 
YTTKVLPAVDGS I WRIiTFYNNVNACILFLPLLLLLGELQALRDP 
AQLGSAHFWGMMTLGGLFGFAIGYVTGLQIKFTS PLTHNVSGTA 
KA(^QTVIAVLYYEETKSFLWWTSIWMViy3GSSAYtVAmGWEMK 
KTPEEPS PKDSEKSAMGV 


6579 


2 


711 


RPPRVWYPELRELSAAAPRVJSHRTAPG IMVFY FTSSS VNSSAYT 
IYMGKDKYENEDLI KHGWPED IWFHVDKLSSAHVYLRLHKGEN I 
EDI PKEVLMDCAHLVKANS IQGCKMNNVNVVYTP WSNLKKTADM 
DVGQ I G FHRQKDVKI VTVEKKVl^I LNRLEKTKVERFPDLAAEK 
ECRDREERNEKKAQIQEMKKREKEEMKKKREMDELRSYSSLMKV 

ENMSSNQDGNDSDEFM 


6580 


62 


1571 


LVALKNWKPKGTNI PAPQSPVFGEAVSGVYMMTKVLGMAPVLGP 
RPPQEQVGPLMVKVSEKEEKGKYLPSI^MFRQRFRQFGYHDTPG 
PREALSQLRVT*CCEWL»RPBIHTKEQII*ELI»VLEQFI»TIIiPQEIjQ 
AWVQEHCPESAEEAVTIXEDLEREI^EPGHQVSTPPNEQKPVWE 
KI S SSGTAKES PSS MQPQPI»ETSHKYES WGPLYIQE SGEEQEFA 
QDPRKVRDCRLSTQHEES ADEQKGSEAEGLKGDI ISVT IANKPE 
ASLERQCVNLENEKGTKPPI^EAGSKKGRESVPTKPTPGERRYI 
CAECGKAPSNSSNLTKHRRTHTGEKPyVClKCXjKAFSHSSKLTI* 
HYRTHLVDRP YDCKCG KAFGQS SDLLKHQRMHTEEAP YQCKD CG 
KAFSGKGSLIRH YR 1 HTGEKP YQCNE CGKS FSQHAGLS S HQRLH 
TGEKPYKCKECGKAFNHSSNFNKHHRIHTGEKPYWCHHCGKTFC 
SKSNLSKHQRVHTGEGEAP 


6581 


228 


476 


RVF^KDI*SSTPMASNNTASIAQARKLVEQLKMEANIDRIKVSKA 
AADLMAYCEAHAXEDPLLTPVPAS ENPFREKKFFCAI L 


6582 


1428 


718 


CFTTKTHCS PVSVPYI>SPLVLR KEL.ES LLENEGDQVIHTSSF2N 
QHPI I FWTLVW YFRRLDLPSNLPGLILTS EHCNEGVQLPLSSLS 
QDSKLVYIQLLWDNIMLHQEPREPLYVSWRNFNSEKKSSLLSEE 

QQETSTLVET I RQS IQHNNVLKP INLLSQQMKPGMKRQRS LYRE 
ILFLSLVSIX5RENIDIEAFDNEYGIAYNSLSSEILERliQKIDAP 

PSASVEWCRKCFGAPLI 


6583 


487 


41 


RI FSMTSGRl»RWRCTWRPATAI>WSASIiRLGTSSMHPS PRSISLP 
LSMMI^PLPSNTRGIiSPTALFRSPDSEHATSCPRLHLWRCRAPL 
RSPSPLGRLQVLPRS P LHVHTHN SG KEVLGL.Q VQRS RS G TG PAC 
SQAGSGAVQGGNWCI F 


6584 


189 


1750


PI»PMAAIX3PSSQNVTEYVVRVPKin > TKKYNIMAFNAADKVNFAT 
WNQARLERDLSNKKIYQEEEMPESGAGSBFNRJCLREEARRKKYG 
IVLKEFRPEDQPWLLRVNGKSGRKFKGIKKGGVTENTSYYIFTQ 
CPDGAJTEAFPVHNVTYNFTPLARHRTLTAE 

fs imqqrrlkdqdqdedeeeke krgrrkas elr i hdledd lems 
sdasdasgeeggrvpkaxkkaplaxggrkkkjckkgsddeafeds 
ddgdfegqevoymsdgssssqeepeskakapqqeegpkgvdeqs 
d£seeseeek^>peedkeeeeekkaptpqekkrrkdsseesdsse 
esdidseassaffmakkktppkrerkpsggssrgnsrpgtpsae 

GGSTSSTXJLAAASKLEQGKRVSEMPAAKRIjRIjDTGPQSLSGKST 
PQPPSGBCTTPNSGDVQVTEBAVRRYLTRKPMTTKDLIiKKFQTKK 
TGLSSEQTVNVIiAQILKRIJTPERKMINDKKHFSLKE 


6585


3 


1678 


GPIRNSR IDDFVGGDPRAEASCS VLHS KPHAMADSRDPASDQMQ 

HWKEQRAAQ kadvi .ttgagn p vgdklnvi tvg prg pllvqd wf 

TDEMAHFDRBRI PERWHAKGAGAFGYFEVTHDITKYSKAXVFE 
HIGKKTPIAWFSTVAGESGSADTVRDPRGFAVKFYTEDGNWDL 
VGNNTP I FF IRDP I LFPSFI HSQKRNPQTHLKDPDMVWDFWSLR 
PESLHQVSFXFSBRGIPrXSHRHMNGYGSHTFKLVNAJIGEAVYCK 
FHYKTDQGIKNLSVEDAARLiSQEDPDYGIRDLFNAIATGKYPSW 
TFYI Q VMTFNQ AET FP FN P FDLT KVW PHKDY PL I P VGKL VLNRN 
PVNYFAEVEQ IAFDPSNMPPGIEAS PDKMLQGRLFAYPDTHRHR 
LGPNYLHrPVWCPYRARVT^NYQRDGPMCMQDNQGGAPNYYPNSF 
GAPEQQPSALBHS I QYSGE VRRFNTANDDNVTQVRAFYVNVLNE 
EQRKRLCENIAGHLKDAQI FIQKKAVKNFTEVHPDYGSHIQALL 
DKYNAEKPKNAI HTFVQSGSHLAAREKANL 


6586 


32 


804 


PLPEQPA3STSTMPVSGTPAPNKKRKSSKLIMELTGGGQESSGL 
NLGlUvl £» VPKLJ VrJi*E»c.Jjc>i-Jj i WKooxU'lT r^l^i\\jnJ\v Cir^r jl X n f 
PVFSDS SMDHFQKFL PTVGGQLGT AGQG FS YSKSNGRGGSQAGG 
SGSAGQYGS DQQHHLGSGSGAGGTGGPAGQAGRGGAAGTAGVGE 
TGSGDQAGGEGKHITVF1CTYISPWERAMGVDPQQKMELGIDLLA 
YGAKAELPKYKSFNRTAMPYGGYEKASKRMTFQMPKV 


6587 


75 


1117


RRVPS LGKMPECVJDGEHDIETP YGLLHWIRGS PKGNRPAI LTY 
HDVGLNHKLCFNTFFNFEDMQE ITKHFWCHVDAPGQQVGASQF 
PQG YQ FP S M EQLAAML PSWQH FG F KYVI G I G VGAG A YVLAKFA 
LIFPDLVEGLVLVNIDPNGKGWIDWAATKLSGLTSTLPDTVLSH 
LFSQEELVNmELVQSYKQQIGWVWQANLQLPWNMYNSRRDLD 
Il^PGTVPNAKTLRCPVr^VVGDNAPAEI)GVVECNSKLDPlTTT 
FLKMADSGGLP0VTQPGKLTEAFKYFLOGMGYMPSASMTRIJVRS 
RTASLTSASSVTCSRPQACTHSE3SEGLGQVNHTMEVSC 


6588


137 


501 


LGLQAQLLELRTNNYQLSDEI*RKNGVELTS LRQKVAYLDKEFS K 
EFSKLCSQMEQLEQENCX3I*KEGAAGAGVAQAGP 


6589 


2 


1405 


RPWGSAMATFSRQEFFCK2LI^GCLLPTAQQGIJX3IWLLLAICI^A 
CRIJjWRIXSIjPS YXJQIASTVAGGFFSLYHPFQIiHMVVJ VVLLS LLC 
YLVLFLCRHSSHRGVFLS VTI LI YLIJ^GEMHMVDTVTV*HKMRGA 
QM I VAMKAVS LGFDLDRGEVGTVPSP VE FMG YL YFVGTI VFG P W 
I S FHS YLQAVQGRPLS CRWLQ KVARS I^ALALLCLVLSTCVGP YL 
FPYFIPLNGDRLLRNKKRKARGTMVRWIiRAYESAVSFlJFSNYFV 
G FLS EATATLAGAG FTEEKDHLE W DL»T V S KP LNVE L P R S MVEW 
TS WNLPMS YWIjNNYVFKNALRLGTFS AVLVTYAASALLHGFS FH 
IJ^VTJ^LAFITYVF^r^Paa^I^ILSACVLSKRCPPDCS 
LGLGVPJUjNLLFGALAIFHLAYLGSLFDVDVDDTTEE^^ 
TVHKWSELSWASHWVTFGCWIFYRLIG 


6590 


2177 


656 


VRAY3HVLS LLENVFTPM FCHRDE YFRQLLRGAES PTRNS KLNR 
GSIiSLDDFRNTQKRGESFGISRIGSKIKGVFKSTTMEGAMLPNY 
GVAEGEDDFIEEGI WMEDDS PVEAVSTPNTPRNLAAWKISIPY 
VDFFEDPSS ERKEKKERI PVFCIDVERNDRRAVGHEPEHWSVYR 
RYLEFYVLESKLTEFHGAFPDAQLPSKRI IGPKNYEFLJCSKREB 
FQEYLQKLLQHPELSNSQLLADFLS PNGGETQFLDKI LPDVNLG 
KIIKSVPGKLMKEKGQHLEPFIMNFINSCESPKPKPSRPELTIL 
SPTSENiraKLFNDLFKNNAlSn*AENTERKQNQNYFMEV^ 
DYLMYVGRVVFQVPDWIJlHIiLMGTRILFKNTLEMYTI3YY^ 
EQLFQEHRLVSLI TLLRDAI FCENTE PRSLQDKQKGAKQT FEEM 
MNYIPDLLVKCIGEETTKYESIRLLFDGLQQPVLNKQLTYVLLDI 
VI QELFPE LNKVQKEVTS VTS WM 


6591


2177 


656 



VRA YE HVLSLJjENVFTPMFGHRDEYFRQLLRGAES PTRNS KLNR 
GSLSLDDFRNTQKRGES FG ISR I GS KIKGVFKS TTMEGAMLPNY 
GVAEGEDDFIEEGI WMEDDS PVEAVSTPNTPRNLAAWKI S IPY 
VDFFEDPSSERKEKICER I PVFCIDVERNDRRAVGHEPEHWSVYR 
RYLEFYVLESKLTEFHGAFPDAQLPS KRI IGPKNYEFLKSKREE 
FQEYLQKLLQHPEIjSNS QLLADFLS PNGGETQFLDKI L PD VNLG 
KIIKSVPGKLMKEKGQHLEPFIMNFINSCESPKPKPSRPELTIL 
S PTS ENNKKLFNDLFKNNANRAENTEJEUCQNQNY FMEVMTVEG VY 
DYLMYVGRVVFQVPDWIJIHLLJ^TRILFKNTIiEMYTDYYI^CKL 
EQLFQEHRLVSLI TLLRDAI FCEN TE PRSLQDKQKGAKQTFEEM 
MNYIPDLLVKCIGEETKYESIRLLFDGIiQXJPVIiNKQLTYyLLDI 
VIQELFPELNKVQKEVTSVTSWM 


6592 


3 


1861 


APEFLGSTISSGSMIDANLKLLQEAEQRLKAIVAEKFAIATKEG 
DLPQVERFFKI FPLLGLHEEGLRKFSE YLCKQVASKAEENLLMV 

LYTL I KYLQVECDRQVEKVVDKFI KQRDYHQQFRHVQNNLMRNS 
TTBKI E PRELDP I LTEVTLMNARS EL YLRFLKKR I SSDFEVGDS 
MASEEVKQEHQKCLDKIiLNNCLLSCTMQELIGLYVTMEEYFM^ 
TVNKAVALDT YE KGQLTS SMVDDVFY I VKKCIGRALSSS S IDCL 
CAM INLATTELES DFRDVLCN KLRMGF PATTFQDIQRGVTS AVN 
IMHSSLOGX3KFDTKGIESTDEAJCMSFLVTLNNVEVCSENISTLK 
KTLES DCTKLFSQG IGGEQAQAKFDS CLSDLAAVSNKFRDLLQE 
GLTELN S TAI KPQVQPW INS FFSVSHNIEEEEFNDYEANDPWVQ 
QFILNLEQQMAEFKASLS PVI YDSLTGLMTSLVAVELEKWLKS 
TFNRLGGLQFDKELRSLI AYLTTVTTWTI R DKFARLSQMAT I LN 
LERVTE I LD YWGPNSGPLTWRLTPAEVRQVLALRI DFRSEDI KR 
LRL 


6593 


3 


1837 


EAFS AGS RRRGLALQRGVLGGLGG Y C P CCCRRRGRLL VL LLL VR 
RGGEGGGGRGRGDKRRRRQ ARRQRRRPE PAEARGGKMADVLS VL 
RQYNIQKXEIVVKGDEVIFGEFSWPKNVKTNYVVWGTGKEGQPR 
E YYTLDS I L FLLNNVHLSHP VYVRRAATEN I PWRRPDRKDLLG 
YLNGEASTSASIDRSAPLEIGLQRSTQVKRAADEVLAEAKKPR I 
EDEECVRLDKERLAARLEGHKEGIVQTEQIRSIjSE^ISVEKIAA 
I KAKIMAKKRSTI KTDLDDDI TAliKQRS FVDAEVD VTRD IVS RE 
RVWRTRTT I LQSTGXN FSKNI FAI LQS VKAREEGRAPEQRPAPN 
AAPVDPTIJiTKQPIPAAYNRYDQERFKGKEETEGFKIDTMGTYH 
GOTLKS VTEGASARKTQTPAAQP VPRP VSQARP PPNQKKGSRTP 
1 1 1 IPAATTSLITMLlCUa3LI^DLKFVPSDEKKKOGCQRENBTlj 
IQRRKDQMQPGGTAI SVTVPYRWDQPLKLMPQDWDRWAVFVQ 
GPAWQFKGWPWLLPDGSPVDI FAKIKAFHLKYDEVRLDPNVQKW 
DVTVTUEI*SYHKRHIiDRPVFLRVV^TLDRYMVKHKSHI»RF 


6594 


1


1096 


EFPGRRFRGSQA^ PLCATCGPAIjLRAPTRAAMTRSLFKGNFWSA 
DI LST1G YDNI IQHLNNGRKNCKEFEDFLKERAAIEERYGKDLL 
NLSRKKPCGOSEINTLKRALBVFKQQVDNVAQCHIQIAQSLREE 
ARKMEEPREXQKLQRKKTElIMDAIHKQKSIiQFKI^^ 
Q KCRDKDEAEQAVS RS ANL VN P KQQEKLFVKLiATS KT AVEDSDK 
AYMLHIGTUDKVREEWQSEHI KACEAFEAQECERINF FRNALWIj 
H VNQLSQQCVTSDEMYEQVRKSIiEMCS IQRD I E YFVNQRKTGQI 
PPAP IMYENFYSSQKNAVPAGKATG PNLARRGPLPI PKSS PDDP 
i NYSI/VDDYS I*I*YQ 


6595 


57 


781 


PIX»TMSDSDIX5EDEGLLSIAGKRKRRGNIiPKESVKIIJU>WIjYLH 
RYNAYPSEQEKI*SLSGOTNLSVLQI CNWFlNARRRLLPDMIiRKD 
GKDPNQFTI SRRGGKASDVALPRGSSPSVIAVSVPAPTNVLSLS 
VCSMPLHSGQGEKPAAPFPRGELESPKPLVTPGSTIiTIiliTRAEA 
GS PTGGLFNTPPPTPPEQDKEDFS SFQLLVEVALQRAAEMELQK 
QQDPSLPIiIjHTP I PLVSENPQ 


6596 


2 



1026 


PRIiPVRRYHGRRRJjO^RSRGHMAEGDAGSDQRQNEEIEAMAAIY 
GEEWCVIDDCAKIFCIRISDDIDDPKWTIjCLQVKLPNEYPGTAP 
P I YQLNAP WI»KGQERADI»SNSLEE I YIQNIGES ILYXWVEKIRD 
VL I QKSQMTEPGPDVKKKTEEEDVECEDDL I IiACQPES S VKALD 
FBI SETRTEVE VEELP P I DHG I P I TDRRSTFQAKLAPWCPKQV 
KMVLSKIjYENKKIASATHNI YAYRI YCEDKQTFLQDCEDDGETA 
AGGRLLHI^ILNVKNVIWWSRWYGGIIMPDRFKHINNCARN 

I L VE KNYTNS P EESSKAIG KNKKVR KDKKRNEH 


6597 


2 


1026 


PRLP VRRYHGRRRIjQGRSRGHMAEGDAGSDQRQNEEIEAMAAI Y 
GEEWCVI DDCAKI FCIRISDDIDDPKl'JTLCLQVMLPNEYPGTAP 
PIYQLNAPWIiKGQERADI*SNSLEE I YIQNIGES I LYI*WVEKIRD 
VLIQKSQMTEPGPDVKKKTEEEDVECEDDLIIACQPESSVKALD 
FDISETRTEVEVE RT .PP IDHGIPITDRRSTFQAHIAPWCPKQV 
KMVI^KLYENKKIASATHNIYAYRIYCEDKQTFLQDCEDDGETA 
AGGRXlLHLiME I LNVKNVMVWS RWYGG I LLGPDRFXHINNCARN 
I LVEKNYTNSPEESS KAIX3KNKKVRKDKKRNEH 


6598 


1099 


419 


PRVRWATTMAMS FEWPWQYRFPPFFTLQPNVDTRQKQIAAWCSIi 

LRKKGNIjE WLDKS KS SFL I MWRRPEEWGKLI YQWVS RSGQNNS V 
FTLYELTNGEI)TEDEEFHGLDEATIJ^RAIiQALQQEHKAE 1 1 TVS 
DGPRRQVLLAGTCIiPLLLTSHLSRAFKRRQTCXrPPKTGSVTPPD 

SKGLQS 


6599 


164 


1593 


KMAALTTLF KY I DENQDRYI KKLAKWVAIQS VSAWPEKRGEIRR 
MMEVAAADVKQLGGSVELVDIGKQKLPDGSEIPLPPILI/SRIjGS 
DPQKKT VC I YGHLDVQPAALEDG WDS EP FTLVERDGKLHGRGST
DDKGPVAGWINAIiEAYQKTGQEIPVNVRFCTiEGMEESGSEGIiDE 
LI FARKIXT FFKDVD YVCISDNYWIiGKKKPCI TYGLRGI CYFF I E 
VE CSNKDLHSGVYGGS VHEAMTDL I LLMGS LVDKRGNII*I PGIN 
EAVAAVTE EEHKLYDD IDFDI EEFAKDVGAQ I ULHS HKKDH»MH 
RWR YPSLSLHGIEGAFSGSGAKTVI PRKWGKFSIRLVPNMTPE 
VVGEQVTS YIjTKKFAELRS PNEFKVYMGHGGKPWVSDFSHPH Yli \ 
AGRRAMXTVFGVEPDLTREGGS I P VTI*TFQEATGKNVH1»L PVG S 
ADDGAHS QNE KLNRYNY IEGTKMLAAYI*YBVSQI»KD 


6600 


2 


934 


PGR1» FR VAAM ESAG LE Q LLRE LLLPDTER I RRAT EQ LQ I VLRAP 
AALSAIjCDLIAS AADPQ IRQFAAVLTRRRXNTRWRR S 
LKSIiUjTALQRETEHCVSLSLAQLSATI FRKEGLEAWPQLIjQLL 
QHSTHSPHSPEREMGI^LI»SVVVTSRPEAFQPHHREI*UUjIjN^ 
LGEVGS PGLLF YSLRTLTTMA P YI*STE1)VPIjARMI>VPK1j IMAMQ 
Till PIDEAKACEALEAIJ5EIjLESEV?VITPYI*SEVLTFCIjEVAR 
NVALGNAI RIRI LCCL.T FLVKVKS KALLKNR LLAT LAAH P F PHC 
GC . 


6601 


529 


1420 


PRAAARAPPPAVTjRRDRRAATAPGAGEMTLHG PLAQRYFLNH IE 
K I TTWQDP RKAMNQ P LNHMNtiH P AVS S TPV PQRSMAVS Q PNL VM 
NHQH0^MAPSTLSQQNHPTQNPPAGIiMSKPNAIiTTGOX30^KL 
RLQRIQMERERIRMRQEELMRQEAAIiCKQLPMEAETLAPVQAAV 
NPPTMTPDMRSITNNSSDPFTjNGGPYHSREQSTDSGIiGIiGCYSV 
PTTPEDFLSNVDEMDTGENAOT1TMNINPQQTRFPDFLDCLPGT 
NVDIjGTLE S EDL I PLFNDVES AIiNKSE PFLTWI* 


6602 


127 


617 


LLDFPALPKFVLAQSPKAGKPSTMTSMTQSLREVI KAKTKARNF 
ER VLGKI TLVS AAPGKVI CEMKVEBEHTNAI GTIiHGGIiTATLVD 
NI STt4AIXCTERGAPGVS VDMNI T YMS PAKLGEDI VI TAHVUCQ 
GKTIiAFTSVDIiTNKATGKLIAQGRHTKHLGN 


6603 


79 


660 


PVGPSSIiAARTGLGHIiPFLHRliASSRGLDMDI^ 
SGMGATGTLRTS LDPSLE I YKKMFBVKRREQIJ^AIjKNLAQLNDI 
HQQYKIIiDVMLKGLFTCVLEDSRTVLTAADVLPDGPFPQDEKLKD 
AFSHVVENTAFFGDVVLRFPRIVHYYFDHNSNWNIiI*IRWG ISFC 
NQTG VFNQGPHS ? ILSLM 


6604 


3 


688 


TSTAQRQGGERMS FRGGGRGGFNRGGGGGGFNRGGS SNHFRGGG 
GGGGGGNFRGGGRGGFGRGGGRGGFNKGQDQGPP ERWLLGEFL 
HP CE DD I VC KCTT D ENKVP YFNAP VYLENKEQ I G KVDE I FGQLR 
DFYFSVKLSENMKASSFKKIiQKFYIDPYKLLPIjQRFLPRPPGEK 
GPPRGGGRGGRGGGRGGGGRGGGRGGGFRGGRGGGGGGFRGGRG 
GGFRGRGH 


6605 


7 


848


SGSRRGAMRAAGVGLVDOTCra>SAPDFDRDI^DVI*EKAKKANVV 
ALVAVAEHSGEFEKI MQLSERYNGFVIjPCLGVHPVQGI»P PEDQR 
SVTLKDIjDVALPI IENYKDRbLAlGEVGLDFSPRFAGTGBQKEE 
QRQVL IRQIQIAKRIJ^PVNVHSRSAGRPTINIjLQEQGAEKVLIi 
HAFDGRPSVAMEGVRAGYFFS IPPSII RSGQQKI»VKQLPLTS I C 
LETDS PALGPEKQVRNEPWNI S ISABYIAQ VKG IS VEE VIE VTT 
QNALKL F PKXRHLLQK 


6606 


2 


1682 


FVEIRPRAE VANLSAHS AS P I QDAVL KRIiSDliED J. V YRQL»N(jL»£> 
KSLGLIEGYGGRGKGGLPATLSPAEEEKAKGPHEKYGYNSYLSE 

KI SLtDR S I PDYRPTKCKELKYS KDLPQ ISIIFI FVNEAbSVI LR 

7U c a^nowTDTUT t vx , TTT.^7T>nXfC'rvT7PPTiTrvrPIjEEYVHK3?YPC?Ij 
O V tii2^\v iSti.1. it i. riijljiSXj X J-J-i V LfL/rtC3lJC,CiZtJJl^V ruiicii vniu\iruu 

VKWRNQKREGLI RARI EGWKVATGQVTGFFDAHVEFTAGWAEP 
VLSRI QENRKRVILPS I DNI KQDNFEVQRYENSAHGYS WEXWCM 
YI SPPKDWWDAGDPS I»P IRTPAMIGCS FVVNRKFFGE IGLLDPG 
MDVYGGENIEIX3IKWLCGGSMEV1^PCSRVAHIERKKK?YNSNI 
GFYTKRNALRVAEVVIMDDYKSHVYIAWNLPLENPGIDIGDVSER 
RAI^KSLKCKNFQWYLDHVYPEMRRYNNTVAYGELRNNKAKDVC 
LDQGP LENHTAIL YP CHGJTCPQLARY^KEGFLHLGAIG 
DTRCLVDNS KSRLPQLLDCDKVKSS LYKRWNFI QNGAIMNKGTG 
RCLEVENRGLAGIDLILRSCTGQRWTIKNSIK 


6607 


137 


986 


VPACAGLkKEARSLIJ^PPRLI^TKI^ASCRAI^SPPlQSRCrrT 
GISFGGRGGAGPGVPTRTQVFAAMGAVMGTFSSIiQTKQRRPSKD 
KIEDEI»EMTMVCHRPEGLEQIjEAQTNFTKREIiQVIjYRGFKNECP 
SG WNEDT FKQ I YAQ FFPHGDAST YAHYTjFNAFDTTQTGS VKFE 
DFVTALS 1 1* LRGTVHE KLRWTFNI* YD I NKDG Y INQEEMMD IVKA 
IYDMMGKYTYPVuKEDTPRQHVjjvr r y tu>ruwMiu-«jJL v l liUc-ci-iH, 
SCQEDDN I MRSLQLFQNVM 


6608 


224 


1140 


RPCFSSPTGLCPRIjSYPMIIjLOHAVLPPPKOPSPSPPMSVATRS 
TGTLQL P PC KPFGQEASL PLAGE EELS KGGEQDCALEELCXPLY 
CKIX3T\rrLNSAQQAQAHYO^KNHGKKLRNYYAANS CPP PARMSN 
WBPAATPWPVPPCMGSFKPGGRVI IATENDYCKLCBASFSSP 
AVAQAHYQGKNHAKRLRLAEAQSNS FS ES SELGQRRARKEGNEF 
KMMPNRRNMYTVQNNSGPYFNPRSRQR I PRDLAMCVTFSGQFYC 
SMCNVGAGEEME FRQHLE SKQHKS KVS EQRYRNEMENLG YV 


6609 


1 


443 


FRLRCRRFRVAGGRLAGAGLRESRVPAPEQRLSALTLLSWSAVT 
PAAEPGNFQLS PAEPRGPLASPVRAAPRAPCPAAEMSELNTKTS 
P ATNQAAGQEE KG KAGNV KKAEE EE E I D I DLTAP ETEKAALAI Q 
GKFRRFQKRKKDPSS 


6610 


319 


881 


GRKSLCNLH1PIRFPLTYPDMYMGMMCTAKKCX5IRFQPPAIILI 
YESEI KGKIRQRIMPVRNFSKFSDCTRAAEQLKNNPRHKS YLEQ 
VSI^QLEKLFSFLRGYLSGQSLAETMEQIQRETTIDPEEDLNKL 
DDKElAKRKSIMDEIiFEKNQKKKDDPNFVYDlEVEFPQDDQLQS 
CGWDTESADBF 


6611 


978 


212 


PGCSGAGSRWWLPALRHLAMGSTESSEGRRVSFGVDEEERVRV
LQG VR LS ENWWRM KE P S S P P P APTS ST FGLQDGNLRAPHKES T 
LPRSGSSGGQQPSGMKEGVKRYEQEHAAIQDKLFOVAKREREAA 
TKHSKASLPTGEGSISHEEQKSVRI^ARELESREAELRRRDTFYK 
EQLERIERKNAEMYKLSSEQFHEAASKMESTIKPRRVEPVCSGL 
QAQILHCYRDRPHEVLLCSDLVKAYQRCVSAAHKG 


6612 


1724 


992 


VSTHASALSRTQGQPQRQPRAAASGAGAGTAGGGGSGGAEGSKM 
STEAQRVDDSPSTSGGSSDGDQRESVQQEPEREQVQPKKKEGKI 
SSKTAAKI*STSAXRIQKELAEITLDPPPNCSAGPKGDNIYEWRS 
TIIjGPPGSVY^GGVFTLDITFSPDYPFKPPKVTFRTRIYHCNIN 
SQGVICLDILKDNWSPALTI SK\7LLS I CSLLTDCNPADPLVGS I 
ATQYMTNRAEHDRMARQWTKRYAT 


6613 


130 


748 


ELELS SNM PEQSNDYRVAV FGAGGVGKS S LVLR FVKGTFRES Y I 
PTVEDTYRQ V 1 S CDKS ICTLQ I TDTTG SHQFP AMQRLS I SKGHA 
FILVYSITSRQS LEELKP IYEQICEI KGDV3S I P IMLVGNKCDE 
SPSRKVQSSEAEALARWKCAFMETSAKLNHNVKELFQELLNLE 

KRRTVS LQIDGKKS KQQKRKEKLKGKCVI M 


6614 


3 


1191 


S S AAEAMRVLVRRCWGP PIJ^GARRGRPS PQWRAXARLGWEDCR 
DSRVREK? PWRVLFFGTDQ FAREALRAIjHAARENKEEELIDKLE 
WTMPS PS PKGLPVKQYAVQS QLPVYEW PDVGSGE YDVGWAS F 

GVTIMQ IRPKRFDVGPILKQETVPVPPKSTAKELEAVLSRLGAN 
MLISVLKNLPESLSNGRQQPMEGATYAPKI SAGTSCI KWEEQTS 
EQIFRLYRAJGKIIPIiQ/TLWMANTIKLIiDLVEVNSSVIiADPKLT 
GQ ALI PGSVTYHKQSQILLVYCKDGMIGVRSVMLKKSLTATDFY 
NGYIJiPWYQKNSQAQPSQCRFQTU^PTKKKQKKTVAMQQCIE 


6615 


832 


35 


GRVGAGASAMSELPGDVRAFLREHPSLRLQTDARKVRCILTGHE 
LPCRLPEIXJVYTRGKKYQRLVRASPAFDYAEFEPHIVPSTKNPH 
QLFCKLTLRHINKCPEHVLRHTQGRRYQRALCKYEECQKQGVEY 
VPACL VHRRRRREDQMDGDG PR PREAFWE PTSSDEGGAAS DDS M 
TDLYPPELFTRKDIX5STEBGDGTDDFLTDKEDEKAKPPREKATD 
EGRRETTVYRGLVQKRGKXQLGSLKKKFKSHHRKPKSFSSCKQS 

G 


6616 


347 


1886 


I^PCO^RPLSSPPHASEDNLFLFWNCILCAFPHPSPQPLQYP 
WPLLLVITQIPAPRHLRNRPFSFSRGGLDSFSGSLSTPSICRS 
PAWVKMAPWPPKGIiVPAVLWGI*SbFI*NI*PGPI WLQPS PPPQSS P 
PPQPHPCHTCRGLVDSFNKGLERTIRDNPGGGNTAWEEENLS KY 
KDSETR1»VEVI»EGVCSKSDFECHRIJjBI^EEI*VESWWFHKQQEA 
PDLFQWLCSDSLKLCCPAGTFGPSCLPCPGGTERPCGGYGQCEG 
EGTRGGSGHCDCQAGYGGEACGQCGLGYFEAERNASHIjVCSACF 
G PCARCSG PEESNCI^QCKKGWALlHHIjKCVD I DECGTEG AN CGAD 
0FCVNTEGSYEC3U)CAKACIjGCMGAGPGRCKKCSPGYOOVGSKC 
IiDVDECETEVCPGEN KQ CENTEGG YRC I CAEG YKQMEG I CVKEQ 
I PESAGFFSEMTEDELWLQQMFFGI 1 1 CALATLAAKGDLVFTA 
IFIGAVAAMTGYWLSERSDRVLEGFIKGR 


6617 


118 


673 


VWMAWQVSLLELEDRJLQCP I CLEVFKESLMLQOGHS YCKGCIiVS 

KVCVHHRNPLSLFCEKIXiFJilCGIjCGTjI^SHQHHPVTPISTVCS 
RMKEEIJJUiFSEIiKQEQKKVDEL IAKIjVXNRTRI DG SAPSLCPC 
LGPATFTFL 


6618 


548


136 


DG KVARRAPNS PAFQND I Y PI»VS APRATTAESPWS KVLQNTQCR 
NVPKMTSERSRI P CLS AAAAE GTG KKQQEG RAMATLDRKVP S P E 
AFIiGKPWSSWIDAAKliHCSDNVDLEEAGKEGGKSREVMRLNKEA 
WKYGT 


6619 


246


842 


PASSEVLTAAVMFLLIjNCIVAVSQNMG I GKNGDIiPR P PIjRNEFR 
YFQRMTTTS S VEGKQNLV IMGRKTWFS I PKKNRPLKDRI NLVLS 
RELKEPPOXIAHFIUARSI^DALKLTERPFXANKVDMIWIVGGSSV 
YKEAMNHLGHLKLFVTRIMQDFESDTFFSEIDLEKYKLLPEYPG 
ILSDVQEGKHI KYKFE VCEKDD 


6620 


3 



1879 


NSRVDDFVARARMAAENEASQESALGAYSPVDYMS I TS FPRLPE 
DE PAPAAP IiRGRKDEDAFLGD PDTDPDS FliKSARLQR LPS SS S E 

VTVALVMQIYFGDPQIFQQGAVVTDAARCTSIX5IEVLSKQGSSV 
DAAVAAAliCLGIVAPHS SGLGGGGVMLVHDIRRNESHL IDFRES 
APGAJjR EETLQRS WETKPGTXVGVPGMVKGLHEAHQL YGRXPWS 
QVliAFAAAVAQDG FNVTHDLARAIAEQLP PNMS ERFRETFliPSG 
RPPLPGSIjIjHR PDIiAEVLDVIiGTSGPAAFYAGGNIiTIjEMVAEAO 
HAGG VITEEDFSNYSALVEKPVCGVYRGHLVI»S PPPPHTGPAb I 
SALNILEGFNIjTSLVSREQAIiHWAETLKIAIj^^ 
ST ITESMDDMIiS KVEAAYLRGH INDSQAAPAPDLPVYEliDGAPT 
AAQVIjIMGPDDFIVAMVSSLNQPFGSGIiITPSGIIjIjNSQMLDFS 
WPNRTANHSAPSLENSVQPGKRPLSFLLPTVVRPAEGLCGTYIA 
LGANGAARGLSGLTQVRFTPWXAFFSREPS CGLDCRCL.S YLWLV 
SIPRAANMG 


6621 


1


662 


VQGITSYG^RLQAI>RKEKSRDAARSRRGKENFEFYEIiAKLLPLP 
AAlTSQIiDKASIIRLTISYI>KKnU)FANO^DPPWNIjRMEGPPPNT 
SVKVIGAQRRRSPSAIaAIEWEAHLGSHILQSLDGYVFALNQEG 
KFLYISETVS IYIiGLSQVELTGSSVFDYVHPGDHVEMAEQIiGMK 
LPPGRGLLSQGTAEDGASSASSSSQSETPEPVVCFPPASDQFLIj 


6622 


2 


319 


GRASGAQEETEAGGPERARAMEANMPKRKEPGRSLRI kvi smgn 
AEVGKSCIIKRYCEKRFVSKYIiATIGIDYGVTKVHVRDREIKVN 

ifdmaghpffyevrkpf 


6623 


1886 


189 


KALFEKVKKFRLHVEEGDIIjYAM YVRQT VL KVI KFI»1 1 1 AYNSA 
LVSKVQFTVT>CNVDIQDMTGYKNF 

VS I YGLTCLY1X YWLF YRSLRB YS FEYVRQETGFDDI PDVKNDF 
AFMLHM IDQYDPLYS KRFAVFLSEVS ENKLKQLNLNNEWTPDKL 
RQKLQTNAHNRliEI.PLI WDSGLPDTVFEITELQSLKLE 1 1 KNVM 
I P AT IAQLDNLQEIjS IiHQCSVKIHSAALS FLKENUCVLS VKFDD 
MRELPPWMYGLRNLEEliYIjVGSI*SHD ISRNVTLESLRDLKBIjKI 
LS I KSNVS K r PQAWDVS SHLQKMd HNDGT KLVMLNNLKKMTN 
LTELEIiVHCDLERIPHAVFSLI*SIX3ELDIiKENNIjKS I EEIVSFQ 
HJLRKIjTVIiKLWHNS I TYI PEHI KKI/TSTiF.R I tS FSHNKI EVL.PSH 
LFLCNKIRYLDLS YND IRFI PPE IGVLQSLQY FSITCNKVESLP 
DEL YFCKKLKTLKIGKNSLS VI*S PKIGNLLFLS YLDGKGNHFE I 
LPPEI^DCRALKRAGLVVEDALFETLPSDVREQMKTE 


6624 


218 


1786 


GSRRGGGSRI P AVS TH VAPGRS VLiRP FAS GALRLRS L VXALGG C 
RGRPSGLAHLSQETSHWRAXRSGRACLGDFPGEILRS FIMKCTA 
RBWLRVTTVLFMARAI PAMWPNATLLEKLLEKYMDEDGEWW IA 
KQRG KRAI TDNDM QS I LDLHN KLRSQVY PT ASNMEYMTWD VE LE 
RSAESWAESCLWEHGPASIjLPS IGQNLGAHWGRYRPPTFHVQSW 
YDEVKDFSYPYEHECNPYCPFRCSGPVCTHYTQWWATSNRIGC 
AINIXZHNMNI WGQ I WP KAVYIATCNYSPKGNVWGHAP YKHGRPCS 
ACPPSFGGGCRENIX^TCEGSDRYYPPRBEETNEIERQQSQVHDT 
HVRTRSDDSSRNEVISAQQMSQIVSCEVRLRDQCKGTTCNRYEC 
PAGCLDSKAKVIGS VHYEMQSS I CRAA2HYGI IDNDGGWVDITR 
QGRKHY F I KS NRNG I QT I GKYQ SANS FTVS KVTVQ AVT CETTVE 
QLCPFHKPASHCPRVYCPRKLYASKSTLCSCNWNSSIiF 


6625 


1124 


543 


PGPRGGGGSLLSTKAIjGRSRGLGMHPGPSSGGTEGGVPTALRPP 
GPLVPSTS DDNLLKNI E LFDKLAIjRFHGRIiLFLKDVLGDE I COW 
S FYGOGRK I AEVCCTS I VYATEKKQTKVEFPEARI FEE TLNILI 
YETPRGPDPAIiEATGGAAGAGGAGRGEDBENREHRVRRIHVRR 


6626 


3 


1498 


SAVEFVYTDRFHLILX5ISVEFI*CSLRSDATMES I TAC1»HAL»Q Al» 
LDVPWPRSKIGSDQDSG IELLNVLHRVILTRESPS IQLASLEW 
RQ 1 1 CAAQEHVKEKRRSAEVDDGAAEKETLPEFGEGKDTGGLVP 
GKSLVFATLELCVCILVRQLPELNPKLTGSPGVKATKPQILI.ED 
GSRLVSAALVILS ELPAVCSPEGS IS ILPTII. YLTIGVLRETAV 
KLPGGQLSSTVAASLQAliKG I LSS PMARABKSRTAWTDIiLRS AL 
TTI LDCWDP VDETHQELDEVSLLTAI TVF I LSTS PEVTT I PCLQ 
KRCIDKFKATLEI KDPWQ I KTYQLLHS I FQ YPNPAVS Y P Y I YS 
LASCIMEKLQEIDKRKPENI7AELEI FQEGIKVLETLiVTVAEEHH 
RAQLVACLLP I LISFLLDENSLGSATS IMRNLHDFALQNLMQIG 
PQYSSVFKSLVASSPAIiKARLEAAIKGNQBSVKVKIPTSKYTKS 
PGKNSSXQLKTSFl* 


6627 


1 


697 


G I PHI>SSRDMTGTPGAVATRDGEAPERSPPCSPSYDLTGKVMIiL 
GDTG VGKTCFL. I Q FKDG AFI*S GT F IATVG I D FRNKVVTVDG VRV 
KLQ I WDTAGQE RFRS VTHAYYRDAQALLLL YD I TNKSS FDNIRA 
WLTEIHEYAQRDWIMLLGNKADMSSERVIRSEDGETLAREYGV 
PFI^TSAKTGMNVELAFIAIAKELKYRAGHQADEPSFQIRDYVE 
SQKKRSSCCSFM 


6628 


1 


1861 


QCAE FGGGSGGGGGSGGGGSGGGRG AGGEENKENERPSAGS KAN 
KEFGDSLSLEILQI IKESQQQHGIjRHGDFQRYRG YCS RRQRRLR 
KTLNFKMGNRHKFTGKinn'EEIiLTDimYLIi 
QLKQEANTEPRKRFHLLS RiRiQWKHAEELERLCESNRVDAKTK 
LEAQAYTAYLSGMLRFEHQE W KAAI EAFNKCKT I YEKLASAFTE 
EQAVLYNQRVEE I S PNIRYCAYNIGDQSAINELMQMRLRSGGTE 
GLLAEKLEALITQTRAKQAATMSEVE^GRTVPVKIDKVRI FXL 
GLADNEAAI VQAE SEETKERLFESMLS ECRDAI Q WREELKPDQ 
KQRD Y ILEX5EPGKVSNLQYLHS YLTYIKLSTAI KRNEKMAKGLQ 
RALLQQQPEDDSKRSPRPQDL.IRLYDI ILQNLVEIjLQLPGLEED 
KAFX3KEIGLKTLVFKAYRCFFIAQSYVLVKKWSEALVLYDRVLK 
YANEVNSDAGAFKNSLKDLPDVQELITQVRSEKCSLQAAAILDA 
NDAHQTETSSSQVKDNKPLVERFETFCLDPSLVTKQANLVHFPP 
GFX3PIPCKPLFFDI*AI^JHVAFPPI^DKLEQKTKSGLTGYIKGIF 

GFRS 


6629 


5653 


4549 


GATPLGS VGGRTGKMDAATLT YDTLRFAEFED FPETSE PVW I LG 
RKYS I FTEKDE I LS DVAS RLWFT YRKNFPAI GGTGPTSDTGWGC 
MliRCGQMI PAQAIiVCRHLGRDMRWTQRKRQPDSYFSVIiNAFIDR 
KDSYYSrHQIAQMGVGEGKSIGQVryGPNTVAOVLKiCIAVFDTWS 
SLA VH I AMDNTWMEEI RRLCRTS VPCAGATAFPADSDRHCNGF 
PAGAEVTNRPS P WRPLVLLI PLRLGLTDINHAYVETLKHCFMMP 
QSLGVIGGKPNSAHYFIGYVGEELIYLOPHTTQPAVEPTDGCFI 
PDBSFHCQHPPCRMSIAELDPSIAVTOGGHI^TQAPGAECC^GM 
TRKTFG FLRFFF S MLG 


6630 


2 


423 


LVQCGGIRRRS AWGAMPGRHVSRVRALYKRVLQLHRVLPPDLKS
LGIXJYVKDE FRRHKTVGSDEAQRFLQKWKVYATALLQQANENRQ 
NSTGKACFGTFLPEEKLNDFRDEQIGQLQELMQEATKPNRQFSI 
SBSMKPKF 


6631 


2 


423 


LVQCGGIRRRSAWGAMPGRHVSRVRALrKRVLQLHRVLPPDLKS 
LG DQ YVKD E FRRHKTVG S DEAQR FLQ E WEVYATALLQQ ANENRQ 
NSTGKACFGTFLPEEKLNDFRDEQI GQLQELMQEATKPNRQFS I 
SBSMKPKF 


6632 


1273 



588 


WNSRGRTQRGAAPLAPAAAMKAWQR VTRAS VTVGGEQ I S AIGR 
GI CVLLGISLEDTQKBLEHMVRKILNLRVFEDESGKHWS KSVMD 
KQYEI LCVSQFTI*QCVLKGNKPDFHLAMPTEQAEGFYNS FLEQL 
RKTYRPELIKDGKFGAYMQVH IQNDG PVTIELES PAPGTATSDP 
KQLSKLEKQQQRKEKTRAKGPSESSKERNTPRKEDRSAS SGAEG 
DVSSEREP 


6633 


1145 


617 


ATGRHEGVPTLEGI IQQLVKGI ITPATI PSLGPWGVLHSNPMD Y 
AWGANGLDAI ITQLLNQFENTGPPPADKEiaQALPTVPVTEEHV 
GSGLE CP VCKD D YALGER VRQLP CNHLFHDG CIVPW LEQHDS CP 
VCRKSLTGQNTATNPPGLTGVSFSSSSSSSSSSSPSNENATSNS 


6634 


1 


1134 


CGGI PRXGSGPRRRLPMARLRBCLPRLMLTLRSLLFWSLVYCYC 
GLCAS I HLXiKLLWS LGKGPAQTFRRPAREHP PACLSDPS LGTHC 
YVRIKDSGLRFHYVAAGERGKPLMLLLHGFPEFWYSWRYQLREF 
KSEYRWALDLRG YGETDAP IHRQNY KLDCL ITD I KDILDSLG Y 
SKCVL I GHDWGGM XAWLI A I CYPEMVMKLiI VINFPHPNVFTEYI 
iJ?HPAQLLKSSYYYFFQIPWFPEFMFSINDFKVLKHLFTSHSTG 
IGRKGCQLTTEDLEAYI YVFSQPGALS G P INHYRNI FSCLPLKH 
HMVTTPTLLLWG ENDAFMEVEMAE VTR FYVKNYFRLTILSEASH 
WLQQDQPDI VNKL I WTFLKEETRKKD 


6635


1420 


470 


EMRAGQQLASMLRWTRAWRLPREGLGPHGPSFARVPVAPSSSSG 
GRGGAEPRPLPLSYRIjI^EAAIjPAVVFLHGLFGSKTNFNS iak 

iiagotgrrvltvdarnhgds phspdms yeimsqdlqdllpqlg 
lvpcvwghsmggktamllalqrpelverliavdis pvestgvs 

HFATWAAMRAJNIADELPRSRARKLADEQLSSVIQDiMAVRQHL 
LTNLVEVTX3RF^/WRVNIJ3ALTQH1J3KIIJVFPQRQESYI/3PTLFL 
LGGNSQFVHPSHHPEIMPJ^FPRAQMQTVPNAGHWIHADRPQDFX 
AAIRGFLV 


6636 


1514


1801 


S FCM FSHKQDSKFQAVPVQEKKKRLRRAP WRAFAQPQRLKH PAE 
Qfl VKvCiiUKr'lriJv.v* VLwr V U^vu"" i>lAjir v uz>rnoi/run^n. v u 
DGGDGVF 


6637 


2 


1501 


CSSSPCFHIXrTCVLDKAGSYKCACLAGYTGQRCElsrLLEAGKSKI 
KASEDSLSVLEERNCSDPGGPVNGYQKITGGPGLINGRI1AKIGT 
WS FFCNNS YVLSGNE KRTCQQNGEWSGKQ PICI KACREPK1 SD 
L VRRRVLPMQVQSRETPIjHQLYS AAFS KQKLQSAPTKKPAliPFG 
DLPMGYQHLHTQLQYSCISPFYRRLGSSRRTCLRTGKWSGRAPS 
CIP ICGKI ENITAP KTQGLRW PWQAAI YRRTSGVHDGSIjHKGAW 
FLVCSGALVNERTVVVAAKCVTDLGKVTMIKTADLK^ 
DDDRDEKT IQSLQI S AX ILHPNYDP I LLDAD IA1 LKLLDKAR I S 
TR VQ P I CLAAS RDLS TS FQES H I T VAG WNVLADVRS PG FKNDTL 
RSGWSVVDSIiLCEEQHEDHGIPVSVTDNMFCASWEPTAPSDIC 
TAETGGIAAVS F PG RAS PEPR WHLMGL VS W S YDKTCSHRLSTAF 
SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


6638

1391 



224 



6639 



2046 



1268 



117 



1043 



6641 



894 



6642 



22 



1296 



6643 



3049 



2265 



6644 



1489 



290 



Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 
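
The column headings and legend above can be read mechanically. The following is a minimal Python sketch, not part of the original disclosure, of how one table entry might be tallied against the legend and how the two nucleotide location columns might be compared. The helper names summarize_segment and nucleotide_span are hypothetical, and treating a beginning location larger than the end location as a reverse-direction entry is an assumption that the table itself does not state.

```python
# One-letter codes exactly as given in the table legend.
RESIDUES = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}

# Non-residue symbols defined by the same legend.
MARKERS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(segment: str) -> dict:
    """Hypothetical helper: tally legend symbols in one amino acid segment."""
    cleaned = "".join(segment.split())  # drop spaces and line breaks from the layout
    residues = 0
    markers = {name: 0 for name in MARKERS.values()}
    unrecognized = []  # characters the legend does not define (e.g. scanning damage)
    for ch in cleaned:
        if ch in RESIDUES:
            residues += 1
        elif ch in MARKERS:
            markers[MARKERS[ch]] += 1
        else:
            unrecognized.append(ch)
    return {"residues": residues, "markers": markers, "unrecognized": unrecognized}


def nucleotide_span(begin: int, end: int) -> dict:
    """Hypothetical helper: span implied by the two nucleotide location columns."""
    length = abs(end - begin) + 1
    return {
        "length_nt": length,
        "whole_codons": length // 3,
        # Assumption: begin > end is interpreted as a reverse-direction entry.
        "direction": "forward" if begin <= end else "reverse",
    }


if __name__ == "__main__":
    print(summarize_segment("ISQCNWDPPYRHVS* /RPSTNLGVHTAHTSEHLRL"))
    print(nucleotide_span(176, 890))
```

For an entry listed with locations 176 and 890, nucleotide_span reports 715 nucleotides, or 238 whole codons. Characters collected under "unrecognized" are ones the legend does not define, which in a scanned table may simply reflect character recognition errors rather than anything in the sequence itself.
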



TKVLPFKDWIERNMK 
"GGIPQAGGKMAAPWWRAALCECRRWRGFSTSAVLGRRTPPLGPM 
PNSDIDLSNl.ERLEKYRSFDRYRRRAEQEAQAPHWWRTYREYFG 
EKTDPKEKIDIGLPPPKVSRTQQLLERKQAIQELRANVEEERAA 
RIxRTASVPIxDAVRAEWERTCGPYHKQRXiAEYYGLiYRDIjFHGATF 

V P RV PliHVAYAVGE DDLM P VYCGNEVT PTEAAQ AP EVTYEAEEG 
SLWTLIJyTSIJXJHLI^PDAEYIaHV^^ 

PPFPARGSGIHRU'UTJiFKQDQPIDFSEnARPSPCYQIiAQRTFR 
TFDFYKKHQETMTPAGLSFFQC3RWDDSVTYIFHQLLDMREPVFE 

FVRPPPYHPKQKRFPHRQPLRYUDRYRDSHEPTYGIY 

IGCFI(4DGGDDGNLIIKK RFVSEAELDERRKRR0gEWEKVRKPE 
DPEECPEEVYDPRSLYERLQEQKDRKQQEYEEQFKFKNMVRGLD 
EDETNFLDEVSRQQELIEKQRREBELKELKEYRNNLKKVGISQE 

NKKE VEKKLTVKP I ETKNKFS QAKLIiAGAVKHKS S ESGNS VKRL 
KPDPEPDDKNQEPSSCKSLGNTSLSGPSIHCPSAAVCIGIIjPGL 

GAY S GS SDSE5 SS DSEGT INATGKI VS S I FRTNTFLEA P 

VLEPPDVSMAESEDRSLRIVLVGKTGSGI 
RIAAQAVTKNCQKASREW(^RDLLVVDTPGLFDTKESIJ5TTCKE 

ISRCI I S SCPGPHATVLVLIiLGRYTEEEQKTVAIi I KAV FG KSAM 
KHMVILFTRKEELEGQSFlIDFIADADVGIiKSIVKECGNRCCAFS 
NS KKTSKAEKESQVQEIiVEL I EKMVQ CNEGAYFSDDI YKDTEER 
LKQREEVLRKI YTDQLNEE I KLVBEDKHKSEE KKEKEI KLLKLK 
YDBKIKN1REEAE RNIFKDVFNRIWKMLSEIWHRFLSKCKFYSS 
rGRRSEVRGCAPRPRLRRSARRMDPVPGTDSAPLAGlAWSS 

ASAP PPRGFS AI S CTVEGAPASFGKS FAQKSGY FliCLS SLGSLE 
NPQENVVADIQIVVDKSPLPI^FSPVCDPMDSKASVSKKKRMCV 
KLLPLGATDTAVFDVRLSGKTKTVPGY1.RIGDMGGFAIWCKKAK 

APRP VPKPRGI*SRDMQGI*S LDAASQP S KGGLIiERTAS RLG SRAS 
TLRRNDSIYEASSIiYGISAMDGVPFTIjHPRFEGKSCSPIjAFSAF 

GDI*TI KSIADIEEEYNYGFWEKTAAARIjPPSVS 



> AT ANT I LGE E J FDS 



PI^RMMTKMDPNTM^QR DIIFEIjmiAFPAESDPaNAF^a^TE 
KRKAMYTKDYTCMLGFTNHINPAMDFTQTPPGMI^ 
HQDTY I RIVI»ENSSREDKHEC PFGRS AI EI*T KMLCE I LQVG E L*P 

negrndyhpmffthdrafeelfgiciqianktwkemrataedfn 

KVMQWREQ I TRALPSKPNSLDQFKS KLRSLS YS E I LRliRQS ER 

MSQDDFQSPPIVEIjREKIQPEILEIjIKQQRIiNRIjCEGSSFRKIG 

NRRRQERFWYCRIiAIiNHKVX»HYG13LDDNPGGEVT 

ADI KAI VTGKDCPHMKEKSALKQNKE VIjEIjAFS I L.YDPDETLNF 

IAPNKYEYCIWIlX?I^ALI^KD^SELTKSDIXKriJ^MEMKUU> 

LDLENIQIPEAPPPIPKBPSSYDFVYHYG 



J LHAP AEGRTRGRIAB KP KMLTRK I KXi WD INAH X TCRLC5 G YLt I 
DATTVTEClOTFCRSCLVKYIaEENOTCPTCRIVIHQSHPliQYIG 

HDRTMQDI VYKLVPGIiQEAEMRKQRE FYHKLGMEVPGD I KGETC 
SAKQHLiDSHRNGETKADDSSNKEAAEEKPEEDNDYHRSDEQVSI 
C^ECNSSKlJiGLKRKWIRCSAQATVIJnJCKFIAKKi^ 
DILCNEEir^KDHTUCFVVVTRWP>FXKAPIJCX 

FRPIiATEP RGS S P VQLVS STMS VRTI*P LLFLNIjGGEMXjY I LDQR 
LRAQNIPGDKARKVIjNDIISTMFNRKFMEEI*FKPQELYSKKALR 

TVYERLAHAS IMKIjNQASMDKLYDLMTMAFKYQV1jL.CPRP KDVIj 

lvtfnhldtikgfirdsptilqqvdetlrqi»teiyggi*sagefq 

LIRQTIj1>I FFQDLHI RVSMFLKDKVQNNNGRFVIjPVSGPVPWGT 

evpglirmfnnkgeevkriefkhggnyvpapkegsfefygdrvl 

KIXSTNMYSWQPVETHVSGSSKNIASWTQESIAPNPIJaCEEIJaF 

iarlmggmeikkpsgpepgfri^fttdeeeeqaaltrpe^ 

EVIN 
DEL 



IQATQDQQRSEELARIMGEFE ITEQPRIiSTSKGDDLIAMM 


6645 



6530 


4646 


FVEGLAG YVYKAAS EG KVI/TIiAA I tT »T iNRS ESDIR YliLG YVSQQG 
GQRSTPLI IAARNGHAKVVRIJ^LEHYRVQTQQTGTVR FDGYV ID 
GATALWCAAGAGHFEVVRliLVSHGANVNHTTV^ 
GRLDIVKY LVBNNAN I S I ANKYDNTCLM IAAYKGHTD WRYT ,T.K 
QRAD PNAKAHCG ATAliHFAAEAG H I D IVKELI KWRAAIWNGHG 
^PLWAAESCKADVVELLI>SHADCDRRSRIEAI^LLGASF7^^ 
REN YD I IKTYHYLYIiAMLERFQDGDN I LEKEVLPPIHAYGNRTE 
CRNPQEI^ES IRQDRDAIjHMEGI*I VRERILGADNIDVSHP 1 1 YRG 
AVYADNMEFEQCI KLWbHAIiHIiRQKGNRNTHKDLLRFAQVFSQM 
IH^NETVKAPDIECVLRCSVLE IEQSM^^ 

YECNL YTFLYLVCI STKTQCS EEDQCKINKQ IYNLIHLDPRTRE 
GFTIJjHLAVNSNTPVDDFHTNDVCSFPNALVTKI^I^CGAEVNA 
VDNEGNSALHI IVQYNRPISDFLTLHSII ISLVEAGAHTDMTNK 
QNKTPLDKSTTGVSE IliLKTQMKMSLKCIAARAVRANDINYQDQ 
I PRTLEEFVGFH 


6646 


176 


890 


PSSRMNHLPEDMENALTGSQSSHASIJWIHS INPTQLMAR IES Y 
EGREKKGI SDVRRTFCI^FVXFDLLFVTLLWI I ELNVNGG I ENTI* 
EKEVMQYDYYSSYFDIFLLAVFRFKVLILAYAVCRLRHWWAXAL 
TTAVTSAFLLAKVTLSKLFSQGAFGYVLPI IS FI LAW IETWFLD 
FKVLPQEAEEENRH»Z VQDAS ERAAL I PGGLS DGQFYS PPESEA 
GSEEAEEKQDSEKPLLEL 


6647 


176 


890 


PSSRMlOTIiPEDMENADTGSQSSHASIiRNIHSINPTQIiMARIESY 
EGREKKGlSDVRRTFCl*FVTFDIiLFVTLI*WI IELNVKGG I ENTL» 
EKEVMQYDYYSS YFDI FLLiAVFRF KVXi 1 I> AYAVCRL RHW W AI AL» 
TTAVTSAFLLAXVXIjSKLFSQGAFGYVLPI ISFIIiAWIETWFLD 
FKVL PQEAEEENRXJCiI VQDASERAAIj I PGGLS DGQFYS PPE SEA 
GSEEAEEKQDSEKPULEL 


6648 


413 


897 


RNCWNCFTKYFNSPPED IDHKDSYLI TRSIMAEPDYI EDDNPEIi 
IRPOKLlNPVKTSRNHQDIiHREIiLMWQKRGIiAPQNKPBLQKVME 
KRKRDQVI KQKEEEAQKKKSDLE IE LLKRQQKIiEQLELE KQKLQ 
EEQENAPEFVKVKGNLRRTGQEVAQAQES 


6649 


1357 


832 


WI PRAAGIRHE VKWDVKEI MSQHN I YVDALLKEFEQFNRRLNEV 
SKRVRIPLPVSNILWEHCIRLANRTIVEGYANVKKCSNEGRALM 
QLDFQQFLMKLEKLTDIRPI PDKEFVETYIKAYYLTENDMERWI 
KEHRE YSTKQLTNfliVNVCLGSHINKKARQKLLAAIDD IDR PKR 


6650 


32 


765 


LtVPL VFS LLVQS CKQVYRS IAMKFVPCLlLLVTLS CLGTLGQAPR 
QKQGSTGEEFHFQTGGRDSCTMRPSSIX^QGAGEWIjRVDCRNTD 
QTYWCEYRGQPSMCQAFAADPKSYWNQALQEIiRRLHHACQGAPV 
LRPSVCREAGPQAHMQQVTSSDKGSPEPNQQPEAGTPSIjRPKAT 
VKLTEATQI/SKDSMEEIiGKAKPTTRPTAKPTQPGPRPGGNEEAK 
KKAWEHCWKPFQALCAFLI s ffrg 


6651


3425 



1353 


AKEL1/KVGDFSI/CAGP YQNTADTMENZiS KEPIiAS FVSESFDISA 

cgiatehvkidnsgegltaeagsetlsrdgevgvnsdmhyelsg 

DSDLDIiliGDCRNPRLiDLEDSYTIiRGSYTRKKDVPTDGYESSIiNF 
HNNNQEDWGCSSWVPGMETSL.PPGHWTAAVKKEEKCVPPYVQIR 
DIiHGILRTYANFS ITKELKDTMRTSHGLRRHPS FSANCGLPSSW 
TSTWQVADDLTQNTLDLEYLRFAHKLKQTIKNGDSQHSAS SAMV 
FPKES PTQIS IGAFPSTKISEAPFXHPAPRSRS PLIiVTVVESDP 
RPQGQPRRGYTASSIJ)SS SS WRERCSHNRDLRNSQRNHTVSFHL. 
NKLlKYNSTVKESRITOISIiIIjNEYAEFNKVMKNSNQFIFQD 
DVSGEATAQEMYLPFPGRS AS YEDI 1 1 DVCTNIJHVKX.RS VVKEA 
CKSTFIiFYIiVETEDKS FFVRTKNLIiRKGGHTE I E PQHFCQAFHR 
ENDTLI 1 1 1 RKED I S SHLHQ I PSLLKLKRFPS V I FAGVDS PGDV 
LDHTYQELFRAGGFVISDDKI IjEAVTLVQLKEI IKILEKLNGNG 
RWKWIjLHYRENKKI*KEDERVDSTAHKKNI mlks fqsanii ei*lh 
YHQCDSRSSTKAE IIJCCLLNLQIQHIDARFAVLLTDKPTI PREV 
FBNNGILVTDVNNFI ENIEKIAAPFRSS YW 


6652 


2 


1343 


IPGSTISCSCHSRRLRGGSPAPRLSLGAASPRPRPPSLPLPLPL 
PFPLFLPTRPAERAWIRSRRASEWVGKMEVPRIJDHALNSPTSPC 
EEVIKNLSLEAIQLCDRDGNKSQDSGIAEMEELPVPHNIKISNI 
TCDSFKISWEMDSKSKDRITHYFIDLNKXENKNSNKFKHKDVPT 
KLVAKAVPL PMTVRGHWFLS PRTE YTVAVO/TASKQVDGDYVVS E 
WSE I IE FCTADYSKVHLTQLLEKAEVI AGRMLKFS VFYRNQHKE 
YFD YVREHHGNAMQP S VKDNSGSHGS P I SG KLEGI F FSCSTEFN 
TGKPPQDSPYGRYRFEIAAEKI»FNPNTNI*YFGDFYCMYTAYHYV 
I LVI APVGS PGDEFCKQRLPQLNSKDNKFLTCTEEDGVLVYHHA 
QDVI LEVI YTD P VDLS LGTVAE I TGHQLMS LSTANAKKDPSCKT 
CNISVGR 


6653 


170 



1910 


FFIiEPRIiRPFPASRARFVPARTRPSPIjHPCCFCFEGGGSMLSPQ 
R VA A A A^ RG ATin AMF S S K PG PVOVVLVOKDOHS FELDEKALAS I 
LLQDHIRDIiDWWS VAGAFRKGKS FIL.DFMLRYLYSQKESGHS 
NWI^DPEEPLTGFSWRGGSDPETTGIQIWSEVFTVEKPGGKKVA 
VVI^TOGAFDSQSTVKDCATIFALSTMTSSVQIYNLSQNIQED 
DLQQLQLFTE YGRLAMDEI FQKP FQTLMFLVRDWSFPYEYS YGL 
QGG MAFLDKRl^VKEHQHEEIQNVRNHIRSCFSDVTCFLLPHPG 
T.OVATR PDFDGirr.KDlAf5EFKEOLiOAIjI PYVLNPSKLMEKEING 
SKVTCRGLLE YFKAYI KI YQGEDLPHPKSMLQATAEAYNLAAAA 
SAKDIYYNNMEEVCGGEKPYIiSPDIIjEEKHCEFKQIALDKFKKT 
KKMGGKDFS FR YOOELEEEI KEL YENFCKHNGS KNVFSTFRTPA 
VIjFTG I VAL Y IASGIiTGF IGLE VVAQLFNCMVGLLIiIALLTWGY 
I RYSGQ YRELGGAIDFGAAYVLEQAS SHIGNS TQATVRDAWGR 
PSMDKKAQ 


6654 


1 


705 


RTSI*SPSQCSSFNLAMASAGMQII^VVIiTIJ^WWGIiVSCALPM 
WKVTAFIGNS IVVAQVVWEGI^MSCVVTQSTGQMQCKVYDSLI^AL 
PQDLQAARALCVIAIJbVALFGIjLVYI^GAKCTTCVEEKDS KARL 
VLTSGIVFVISGVLTLIPVCWTAHAVIRDFYNPLVAEAQKRELG 
ASLYI>GWAASGLLLI^GGIJ^CCTCPSGGSQGPSHYMARYSTSAP 
AISRGPSEYPTKNYV 


6655 


341 


16 


KDAYM FKKGLIALALVFS LP VFAAEHW IDVR VPEQ YQQEHVQGA 
INIPLKEVKERIATAVPDKtJDTVKVYCNAGRQSGQAKEILSEMG 
YTHVENAGGLKDIAMPKVKG 


6656 


2 


1212 


TELPPRFANIJVIQPPLSPLRALAPLPEKPGAVP.PPQKRMAKVAK 
DI^PGVKKMSLGQLQSARGVACLGCKGTCSGFEPHSWRKICKSC 
KCSQEDHCXTSDLEDDRKIGRIiKDSKYSTLTARVKGGDGIRIY 
KRNRMIMTN P IATCKDPTFOTITYEWAPPGVTQKLGLQYMEIjIP 
KEKQPVTGTEGAFYRRRQLMHQLP I YDQDPSRCRGLLENELKLM 
EBFVKQYKSEAiGVGEVALPGO^LPKEEGKQQEKPEGABTTAA 
TTNGSLSPPSKEVEYVCELCKGAAPPDSPWYSDRAGYNKQWHP 
TCFVCAKCSEPLVDLI YFWKDGAPWCGRHYCESLRPRCSGCDE I 
I FAED YQRVEDLAWHRKHFVCEGCEQLLSGRAY 1 VTKGQLLCPT 
CSKSKRS 


6657 


830 


2120 


LLTCQERAGDCLLSASTMKEWYWSPKKVADWLLENAMPEYCEP 
LEHFTGQDL INLTQEDFKKPPLCRVS S DNGQRLLDMI ETLKMEH 
HLEAHKNGHANGHLNIGVDI PTPDGS FS IKIKPNGMPNGYRKEM 
I KI PM PELERS QYPMEWGKTFLAFIiYAIjS CFVLTTVMI SWHER 
VPPKEVQPPLPDTFFDHFNRVQWAFS I CE INGMILVGLWLIQWL 
LLKYKS I ISRRFFCI VGTLYLYRCITMYVTTLPVPGMHFNCS PK 
LFGDWEAQLRRIMKLIAGGGLS ITGSHNMCGDYLYSGHTVMLTL
TYLFIKEYS PRRLWWYHWI CWLLS WG I FCI LLAHDHYTVDVW 
AYYITTRLFWWYHTMANQOjVLKEASQ 
NVQG I VPRS YHWP FPWPWHLS RQVKYSRLVNDT 


6658 


35 


855 


HCCAIjGAPGSPYRGLYFSSAAPCTAPRKAKHQSTLEGLTKRMLM 
FDP VP VKQBAMDP VS VS Y PSN YMESMKPNKYGVT YST PI*PEKF F 
QTPEGIjSHGIOMEPVDLTVNKRSSPPSAGNSPSSLKFPSSHRRA 
SPGLSMPSSSPPIKKYSPPSPGVQPFGVPLSMPPVMAAALSRHG 
IP^PGILPVIQPVVVQPVPFMYTSHLQQPLMVSLSEEMENSSSS 
MQVPV1 ES YEKPISQKKIKIEPG I EPQRTDYYPEEMS P PIjMNS V 
SPPQALLQE 


6659 


18 


523 


E PQRGDCETWFQNCS LPKFVCFFCWGFWLWRAHSMSNLHSIiPGL 
RGLTS I SRNQLQCTNAMRVI NNYQRRWKNQNTFIjIiAT FANWNV 
CGNPTI TCPHNRTLNNCHHSGVQVPLMYCflLTTPS PQN I SNCRY 
AQTPANMFYIVACDNRDQRRDPPQYP WPVHLHTI I 


6660 


514 


1707 


CAAS LDCRHHLCE PDMXLVWP S AKLLQAAAGAS ARACD S VTSNV 
LPLLIaEQFHKHSQSSQRRTI LEMIiLGFLKLQQKWSYEDKDQRPL 
NG FKDQI*CS I»VFMALTDPSTQI*QI*VG IRTLTVXjGAQPDLLS YED 
LElAVGHIjYRIiS FIjKEDSQSCRVAALEASGTLAALY P VAFSSHI* 
V^KIiAEEIiRVGESNLTNGDEPTQCSRHIiCCLQALSAVSTHPS IV 
KETLPLLLQHLWQ VNRGNMVAQS SDV I AVCQSLRQMAE KCQQDP 
ESCWYFHQTAI PCLLAIAVQASMPEKEPSVLRKVLLEDEVIAAM 
VSVIGTATTHLSPELAAQSVTHIVPLFLDGNVSFLPENSFPSRF 
QPFQIXSSSGQRPOilAIAMAFVCSLPPJSWSEHIWEVI^FNIiDKVT 
PG 


6661 


175 


430 


GVHAASGTLSATWJLAEAKMFDSIJUCAGKYLGQAAKLM I GMPDYT) 
NYVEHMRVNHPDQTPMTYEEFFRERODARYGGKGGARCC 


6662 


185 


423 


RSIjP KPAPAQPAS IHCARFSGVT P PTAKTAMSDGNTAFNALM YC 
G P KAD DGN I FS ACAP AS SAVKAS VS VAQ PGQAVT P 


6663 


3 


1005 


RPVI^SRVDDFVPPLPETSGI^KKI-ERMYSVDRVSDDI PIRTWF 
PKENLFSFQTASTTMQAISNFRKRLRMVGSRRVKAQTFAERRER 
SFSRSWSDPTPMKADTSHDSRDSSDLQSSHCTIjDEAFEDLDWDT 
EKGLEAVACDTEGFVP PKVMIil S S KVPKAEY I PTI IRRDDP S 1 1 
PILYDHEHATFEDILEEIERKLNVYHKGAKI WKMLI fcqggpgh 
I»YI»LKNKVATFAKVEKEEDMIHFWKRI*SRLMSKVNPEPNVIHIM 
G CY I LGNPNGEKLFQNIjRTIJ^TP YRVTFES PLEtiSAQGKQMI ET 
YFDFRLYRLWKSRQHS KLLDFDDVL 


6664 


58 


968 


PRlJJUiPRSVVVMDSPWDEIAIAFSRTSMFPFFDIAHYI,VSVMA 
VKRQPGAAALAWKNPI SSWFTAMLHCFGGGI I*S CLLLAEPPLKF 
IiANHTNILLASS IWYITFFCPHDLVSQGYSYLPVQIiLASGMKEV 
TRTW KI VGG VTHANS Y Y KNG W I VM I AI GWARG AGGTI ITNFERL. 
VKGDV7KPEGDEWLKKSY PAKVTLLG S VI FTFQHTQHIiA 1 S KHNL 
MFIiYTI FIVATKITMMTTG^STMTFAPF^DTLSWMLFGWQQPFS 
SCEIOCSEAKSPSNGVGSIiASKPVDVASDNVKKKHTKKNE 


6665 


171 


1278 


DERRIACRQWTQQRSELYPGFQKRQRFLPKAGEEAAAQGGRHL 

pgrwlgpgctqnpcsvhtatgpeprklpllppdspnsgypkepa 
aix:pgipspcrmthqdlsitaklinggvaglvgvtcvfpidlak 

TRIiQNQHGKAM YKGMI DCIiMKTARAEGFTGM YRG AAVNLTLVTP 
EKAI KLAAXTOFFRRJ^EDGMQRl^K^MIAGCGAGM 
P^MLKIQLQDAGRLAVHHQGSASAPSTSRSYTTGSASTHRRPS 
ATLIAWELI^TQGIAGLYRGI/^TIiRDIPFSIIYFPLFAIJlrNN 
LGFNELAGKAS FAHS FVSG CVAG S IAAVAVTPLDVLKTR I QTLK 
KGIX3EDMYSGI TDCAR 


6666 


498 


2868 


MTTFIjPVPQMMAGFSFGTFGNPPMESPSAWQTIHQPFIVSCLTL 
WS PGCW PQPIQ KEG VGLWD I R KPQS SI»LR YGGN1»S LQS AMS VRF 
NSNGTQLLALRR RL PP VL YD IHS RI» P VFQFDNQ VY FN S CTM KS C 
CFAGDRDQY ILSGSDDFNLYMWRI PADP EAGG I GRWNG AFMVL 
KGHRS IVNQVRFNPHTYMICSSGVEKI I KIWSPYKQPGCTGDLD 
GRIEDDSRCLYTHEEYISLVXiNSGSGLSHDYANQSVQEDPRMMA 
FFDSIiVRREIEGWSSDSDSDI>SESTILQLHAGVSERSGYTDSES 
aASLPRSPPPTVDESADNAFHI^PI^VTTTNTV7ASTPPTPTCED 
AASRQQRLSALi^YQDKiUJLiALSNE^DSEENVCEVEL.DTDL.FPR 
PRSPSPBDESSSSSSSSSSEDBEELNERRASTWQRNAMRRRQKT 
TREDKPSAP I KPTNT Y I GEDNYD YPQ I KVDDLSSS PTSS P ERS T 
STLEIQPSRASPTSDIESVERKIYKAYKWLRYSYISYSNNKDGE 
TS1*VTGEADEGRAGTSHKDNPAPSSSKEACLNI AMAQRNQDL.P P 
EGCSKIXrPKEETPRTPSKGPGHEHSSHAWAEVPECTSQDTGNSG 
SVFJIPFETKKI^GKALSSRAEEPPSPPVPKASGSTLNSGSGNCP 
RTQSDDSEERSIjETICANHNNGRLHPRPPHPHNNGONLGEIiEVV 
AYSSPGHSDTDRDNSSLTGTLLHKDCCGSEMACETPNAGTRHDP 
TDT PATDS S RAVHGHSG LKRQR I E L3DTDS ENSSS EKKLKT 


6667 



171 



1310 



ABEVERIiAAMRS DSLVPGTHTPPI RRRSKFANLGRI PKPWKWRK 
KKSEKFKHTSAALERKISMRQSREEZjIKRGVLKEIYDKDGSLSI 
SNEE1)SLENGQSLSSSQLSI»PAI>SEMBPVPMPRDPCSYEVIjQPS 
DIMIX3PDPGAPVK1>PCXPVKI*SPPLPPKKVMI CMPVGGPDLSIiV 
S YTAQ KSGQQGVAQHHHTVLPSQI QHQUQYGSHGQHI*PS TTGSL 
PMHPSGCRMIDE1*NKTIAMTMQRLESSEQRVP CSTSYHS SGLHS 
GDGVTKAGPMGLPEIRQVPTWI ECDDNKENVPHESDYEDSSCL 
YTREEEEEEEDEDDDSSLYTSSLAMKVCRKDSLAI KPSNRPSKR 
ETLEEKNI LPRQTDEERLEbRQQIGTKL 


6668 


714 


358 


TLAVATGPALTLRCHVCTS SSNCKHS WCPAS SRFCKTTNrVEP 
LRGNLVKKDCAE S CTPS YT LQGQVS S G TS STQ CCQ EDL CNE KLH 
NAAPTRTALAHSALSI/SIiALSIiLAVILAPSIi 


6669 


459 


1207 


KDEETRKDYDYMLDHPEEYYSHYYHYYSRRLAPKVDVRVVI LVS 
VCAI S VFQFFS WWNS YNKAI S YLAT VPKYRI QATE IAKQQGL.LK 
KAKEKGKSKKSKEEI RDBEENI IKNI I KSKIDI KGGYQKPQ I CD 
U^LFQI IIiAPFHLCS YIVWYCRWI YNFNIKGKEYGEEERLYI IR 
KSMKMSKSQFDSLEDHQKETFLKRELW I KENYEVYKQEQEEELK 
K3CLANDPRWKRYRRWMKNEGPGRLTFVDD 


6670 



184 


594 


VARI +GEAAKMSSEP PPPYPGGPTAPLLEEKSGAPPTPGRS S PA 
VMQPPPGMPIiPPADIGPPPYEPPGHPMPQPGFI PPHMSADGTYM 
PPGFYPPPGPHPPMGYYPPGPYTPGPYPGPGGHTATVIjVPSGAA 
TTVTV 


6671 


1 


763 


LPAE KPRS APNMAGGRCJGPQLTALLAAW I AAVAATAGPEEAAIj P 
PEQSRVQPMTASNWTLVMEGEWMLKFYAPWCPS CQQTDSE WEAF 
AKNGE I LQ I S VGKVDVIQEPGLSGRFFVTTIj PAFFHAKDG I FRR 
YRGPGIFEDLQNYILEKJCWQSVEPLTGWKSPASLTMSGMAGLFS 
ISGK1WHIJINYFTV1^IPAWCSYVFFVIATLVFGI>SMDLVL*V 
ISQCNWDPPYRHVS* /RPSTNLGVHTAHTSEHLRL 


6672 


304 


1089 


APGSKP VQ FMDFEGKTS FGMS VFNLSNA I MGSGILGLAYAKAHT 
GVI FFLALLLCIALLSS YS IHLLLTCAG IAG I RAYEQLGQRAFG 
PAGKVWATVX CLHNVGAMSS YLFI IKSELPLVIGTFLYMDPEG 
DWFIiKGKLLI I IVSV1.I HiPLALMKHLGYLGYTSGIaSLTCMLFF 
LVS VI YKKFQLGLCYRATMKQQWES EALVGTPQPRDSTAAVKAQ 
MFHS *LTGVLTQWPIMAFAFVCHPGGAGPSITELCRAFOAQD 


6673 


1116 


1963 


LQ I QTHHTHHGAR VTHLG S HQ LLANAGTMLCRQQS S S MAP A F S Q 
S VTCGP S PC VRKQES ATKCLHI GACGSDLWARGWEQG * G * GXNV 
WLC PCVAFHRGARPQAEEGGARWNSLVS SPW I PPNP * HSS IGAE 
NAVPRP*QG* KVNPSGQERQS \WVLPLPVPGEPLKLPGLPG*NK 
SFSRV/SGSKGKWILPRQLM*AS*R\TPRFVPGTQWVPITW/PI, 
ITWH*SAPTPPLKACPAPRPSDPCSSCLSCPCVTQKPRFSDTGW 
FG AGHCHS S CD FTRKGAAGGPG 


6674 


1 


440 


US FD YMCQ YD Y VEVRDGDNRDGQ 1 1 KR VCGNERPAP I QS IGSSL 
HVLFHSDGSKNFDGFKAIYE^ITACSSSPCFHDGTCVIjDKAGSY 
KCACLAGYTGQRCENI,LEERNCSDPG/WPSQWVPE1SINRGPWAYQ 
PTPC* IGTRVAFFLT 


6675 


277 


1678 


GNWPTERMAFLDNPTI ILAHIRQSHVTSDDTGMCEMVLI DHDVD 
LEKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR 
RRSNTAQRLERLRKERQNQI KCKNI QWKERNS KQS AQELKS LFE 
KKSLKEKPP I SGKQS ILSVRI^QCPIKJLNNPFNEYSKFDGKGHV 
GTTATKKIDVYLPLHS SQDRLLPMTVVTMASARVQDLIGL I CWQ 
YTSEGREPKLNDNVSAyCLHIAEDDGEVDTDFP PLDSNEP IHKF 
GFSTLALVEKYS S PGLTSKESLFVRI NAAHGFSL IQVDNTKVTM 
KEILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH 
HYKS FKVSM I HRLRFTTD VQL/ GCAL FPGVLRKRAAPVDCLRPS 
ADTWRQEQ IGCCGAACAALRS * DS H KC* EG I SGDKVEI D PVTNQ 
KASTKFW I KQKP I S ID S DLLCAC\ DLAEE 


6676 


277 


1678 


GNWPTERMAFLDNPTI I LAHIRQSHVTSDDTGMCEMVL I DHDVD 
LBKIHPPSMPGD3GSEIQGSNGETQGYVYAQSVDITSSWDFGIR 
RRSNTAQRLERLRKERQNQ I KCKNI Q WKEFNS KQSAQELKS L FE 
KKSLKEKP PI SGKQS I LS VRLEQCPLQLNNPFNEYSKFDGKGHV 
GTTATKKIDVYLPLHS SQDRLLPMTVVTMASARVQDLIGLICWQ 
YTSEGREPKLNDNVSAYCLHiAEDDGEVDTDFPPLDSNEPIHKF 
GFSTLAXjVEKYSS PGLTSKESLFVRI NAAHGFSL IQVDNTKVTM 
KB ILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH 
HYKS FKVSM IHRLRFTTDVQL/ GCALFPGVLRKRAAPVDCLRPS 
ADTWRQEQIGCCGAACAALRS*DSHKC*EGISGDKVEIDPVTNQ 
KASTKFW I KQKP I S IDSDLLCAC\DLAEE 


6677 



277 


1678 


GNWPTERMAFLDNPTI I LAHIRQSHVTSDDTGMCEMVLIDHDVD 
LE KI HPPSMPGDSGSEIQGSNGETQG YVYAQSVDITSSWDFG IR 
RRSNTAQRLERLRKERQNQ I KCKN I QWKERNS KQS AQE LKSLFE 
KKSLKEKPP I SGKQSILS VRLEQCPLQLNNPFNEYSKFDGKGHV 
GTTATKKIDVYLPLHSSQDRLLPMTVVTMASARVQDLIGLICWQ 
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFP PLDSNEP IHKF 
GFSTLALVEKYSS PGLTSKESLFVRINAAHGFSLIQVDNTKVTM 
KEILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH 
HYKS FKVSM I HRLRFTTDVQL/GCAL FPGVLRKRAAPVDCLRPS 
ADTWRQEQIGCCGAACAALRS*DSHKC*FX3ISGDKVEIDPVTNQ 
KASTKFWIKQKP I S IDSDLLCAC\DLAEE 


6678 


221 


865 


GPSNQS S GSLS L I VTGCS S Y WS * INDTCT ILRVLS S NFGRQ * LR 
PFPCSQLPMSQGCLWHLDCCCPWVPYI PGQQWRKGRQRMRN * QS 
LLGSDQESVGI^DLCVFVNFLLHVLLGLFP* PHELFLLPWDLG 
FLFPLLLQGGCHCLVLPANLVSQAPQ IGKLS CRLQTHDLEGSRN 
HHPLFLWGRWDAVKHLETVQSGLASLGFVGQHTSHGPP 


6679 


2 


786 


LEFARGAMPFLGQDWRS PGQNWVKT VDGWKRFLDEKSGS FVSDL 
SSYCNKEVYNKENXFNSLNYD/SCSQEEKEGHAE*QNQNS\DFH 
QEKWI YVHKGSTKERHGYCTLGEAFNRLDFSTAI LDSRRFNYW 
RLLELIAKSQLTSLSGIAQKNFMNILEKVVLKVLEDQQNITLIR 

ELLQTLYTSLCTLVKRVGKSVLVGNINMWVYRMETI LHWQQQLN 
N I Q 1 1 KV5 GQAQP P PGSGS LHRDToU * KUUror xtrv iDJiouur 


6680 


1498 


2951 


PLCTLPLMPSALPGWAGERW EKQWPLA/ PGPGTWQTPVGS ISEE 
P \RKNEPDTHCPRG EARPEV * HLPKPHS PGSEGAE I QTSA* AL P 
/NQVSPPQPM*GAEENGDQRGGKEEAGEELHRSSSGLTAAPGF? 
EVHRNLQTFPGLPSRGGGP/GGAGTQGSWAPGEQPP/ SPLLPAS 
MQRSQAGLPG WEAGLVES PTHHI PALRPSGTNATGEAFPS TTCS 
SGP\ PAP PGPTGLRPGGGS S SGGHG * +PGLPVGKV\GALGAAQD 
PQS QGRG PTQGTVGTEMLLSGLGS AKAC PAARPAVP * LPSDPAS 
T I P KKGTRG FG EG PG VLQE RNRWWGRAQGFTSADAAGXAP PGV 
* LPAPLSQPPGATEPQVRACGMAPPS PG TSGRL VANGRH PG PQV 
AQGCPPGAGCWGSQPRGSQRCPRTYTHS PLGHGRAP CPRRCWH * 
WQDPPSS PRTGCLPGI PARQAYSAPRTRSRPGIRTGRAAYGFIR 
FQGGGGG 


6681 


1169 


511 


INYircNQQQRAFHELK\EKLMSAPAI^^ j 
KMTVGVLTQTVGPWSRPGAYLS KQLDGVS KG W P PCPRALAATAL 

TIXCENPHKTI EVSNT / LNP ATLLLVTES P VKHNCLEVLDS VYS 
SRPNLRDHP* TSVDWELYVIX5SGFANPCKVTI*KKETSPAPVTPR 
S 


6682 


109 


1238


TVIaCGAMQVSSLNEVKIYSLSCGKSLPEWLSDRKKRALQKKDVD 
VRRRIELI QDFEMPTVCTTIKVSKDGQYI LATGTYKPRVRCYDT 
YQI^IxKFERCIjDSEVVTPEILSDDYSKIVFLHNDRYIEFHSQSG i 
FYYKTR I PKFGR0FS YHYPSCDLYFVGASSEVYRLNLEQGRYXN 
PLQTDAAENNVCDINSVHGIiFATGT lEGRVECWDPRTRNRVGIili 
D\AP # 1L ^SQQIQR*TSIjFTISAIiKrN \C1AI1TrlAVt7 1 1 IvJ^ VJbJji 
DLRSDKPLLVKDHQYGLPIKSVHFQDSLDLIIjSADSRIVKMWNK 
NSGKIFTSLEPEHDLNDVCXYPNSGMIJJTT^NETPKMGIYYIPVli 
GPAPRWCSFLDNLTEELEENPESNE 


6683 


109 


1238 



TVTjCGAMQV S S LNE VK I YS LiSCGKSLi PE VlljSDRKKRAJJj KKDVD 
VRRRIBLIQDFEMPTVCTTIKVSKDGQY1IATGTYKPRVRCYDT 
YQLSLKFERCLDSEVVTFEILSDDYSKIVFIjHNDRYIEFHSQSG 
FYYKTRIPKFGRDFSYHYPSOTLYFVGASSEVYRI^n^OGRYXiN 
PLQTDAAENNVCD INS VHGLFATGT I EGRVECWDPRTRNRVGLL 
D\AP * TVSQQ1QR* TSLPT I SALKFN\ GALTMAVGTTTGQVLLY 
DIjRSDKPLIjVKDHQ YGLP I KS VHFQDSIiDL IliS ADSR I V KM WNK 
NSGKI FTSLEPEHDl^VCLYPNSGMLLTANE^PKMGIYYI PVL 
GPAPRWCSFLDNLTEBLEENPESNE 


6684 


111 


527 


GLRGGTSRGRAGREPEFAAGVLCWAGFCQSPCPPGGRGREAPA 
PP \SGRRHA*RPA* WliGCaPGGDSGOKiil^ijGS/ 
ELPLDINIQEPRWEJQSTFIiGRARHFFTVTDPRNLIiLSGAQIiEAS 
RNIVQNYR 


6685 


258 



1473 


KIiLGDNFEGFCNKFELSDSENGSNS * QS PIAFDRLFDPDPQKVL 
QGVIDMKNAVIGNNKQKANLIVLGAVPRLLYLLQQETSSTELKT 

LRCLRTIFTSP VTPEELLYTDATVI PHLMALLSRSRYTQEY I CQ 
IFSHCCKG PDHQTI LFNHGAVQNI AH1»I*TSI*S YKVRMQALKCFS 
VLAFENPQVSMTLVNVLVDGELLPQ I FVKMIjQRDKP I EMQI/TS A 
KCLTYMCRAGAI RTDDNC I VLKTLP CJLVRMCS KERKLEERVEGA 
ETLAYLIEPDVELQRIASITDHLIAMLiADYFKYPSSVSAITDI K 
RIiDHDLKHAHEIjROAAFKLYASIX5ANDEI3IRKKVSLGEGRPPVL 
TASRQGVTST 


6686 


310 


927 


DSVTFDDIAVDFTPKEWTLI^PTQRNLYRDVMIiENYKNIiATVGY 
QLFKPSLISWLEQEESRTVQRGDFQASEWKVQIiKTKELAIiQQDV 
LGEPTSSGIQMIGSHNGGEVSDVKQCGDVSSEHSCLKTHVRTQN 
SENTFECYLYGVDFIjTLHKKTSTGEQR5VFSHVWKKPSSLNPDV 
VCQKNRCTRKKKAF* LQkTLGKSFH* S IHT 


6687 


181 


915 


EAMLEAPYKKEEDEQQRKEVKKDYPSNTTSSTSNSGNETSGSST 
I GETS NRSRDRDRYRRRNS RS RS PGRQ CRHR S RS WDRRHG S E S R 
SRDHRREDRVHYRS PPLATGEPVDNLS PEERDARTVFCMQLAAR 
I RPRDL ED FFS AVG KVRDVR IIS DRNS RRS KG IAYVE FCE I QS V 
PLAIGLTGQRLLGVP I IVQAS QAE KNRLiAAMANNLQ KGNGG P MR 
LYVGSLHFNITEDMIiRGI FEPFGKV 


6688 


1025 


1 


AEVPNYPRVFHKCPDSCWRFKFQPIQLQPYII^LSFSSEKPPISF 
SEPGLPR/SATARMATAAAPPNSSIDLPSDSGMGFISPAGDSI.D 
LPSDGGTGFFSLAGDSSSTRLSSIiAFISFSI*SSVSVGSSAGTTS 
STSVGSWAAFTSSSSSSTNRDVAGIiDFSTVITSVSGSIiVPSRE 
VAVICGSKGAGASGSASCSSRAGKTTEATAASSMPSGTSSFSTC 
TMSELEELFSLFSPAPLLSKLFTSSGS IAI CCQDSGPSDTGRLS 
VC^MfLADSOTGKl^DCQEVVTVGDSGGI*TCPELSLGRM *MSIiI* 
SSAVI PGYSSSSDSRLNTVPTVDLL.CPFQTKSST 


6689 


640 


1299 


SSSAS YATSATSISDTAFSGSL KliKHGULSALDSSSRTS * STSS 
AEDSTFRI CSPSVSDTSSDSSGSKDNVI.ILFSKVSI * SCFSLSS 
FFSDS IS FCFSSSSFCKR* FVSSKVSQNAIXSSRLSNG PGGSSK 
QRNS LTARQLAMSL* ATKF * RNACNPNCLS SKKSAL*I>SLNQRF 
GGSASRKPGNISFNSQKCSALSYCCNFVIKPREVSVSSENYPAF 


6690 


1 


442 


GTRGKMAATLGPLGSWQQ>n^RCIjSARIX3SRMLTJ»I JiLT.^3SGQGP 
QQVGAGOTFEYTJO^HSI^KPYQGVGTGSSSLWNLMGNAMVMTQ 
Yl RLTPDMQS KQGAL WNRVPCFLRDWELQVHFK I HGOG KKNlA H 
GDGLAIWYTKDRMQP 


6691 



287 


1401 


I»KTETSEBKARRYKDRPSQIjNAVFQEQKKMIQAQESITIjEDVAV 
DFTWEEWQLLGAAQKDLYRDVMLENYSNLVAVG YQAS KPDALFK 
LEQGEQLWTIEDGIHSGACSDIWKVDHVLERLQSESLVNRRKPC 
HBHDAPENIVHCSKSQFLDGQNHDIFDIjRGKSIiKSNLTLVNQSK 
GYEIKNSVEFTGNGDS FTjHANH ERLHTAI KFPASQKLISTKSQF 
IS PKHQKTRKLEKHHVCSECGKAF I KKSWLTDHQVMHTGEKPHR 
C5LCEKAFSRKFMLTEHQRTHTGEKPYECPECGKAFI*KKSRLNI 
HQKTHTGEKPYI CSECGKGFIQKGNLI VHQRIHTGEKP Y I CNEC 
/G KG FI QKTCLIAJJQRFOTER 


6692 


178 


939 


W I KEGELS LW ERFCAN 1 1 KAGPMPKH IAF I MDGNRRYAKKCQVE 
RQEGHSCXSFNKLAETIjRWCLNIjGILEVIVYAFSIENFKR 
DGLMDLARQKFSRLMEEKE IOX2KHGVCIRVLGDLHLLPLDLQEL 
IAQAVQATKNYNKCFLNVCFAYTSRHE I SNAVREMAWGVEQGLL 
DPSDI SESLLDKCLYTNRS PHPDILIRTSGEVRLSDFLLWQTSH 
S CLVFQ P VLWPE YTFWNI.FEAI LQFQMNHS VLQK 


6693 


178 


939 


W I KEGELSLWERFCAN 1 1 KAGPMPKH IAFI MDGNRRYAKKCQVE 

RQEGHSQG FNKIAETLRW CXjNXjG I LEVTVYAFS IENFKRSKSEV 

IXSLMDIARQKFSRLMEEKEKIXJKHGVCIRVIiGDLHIjIjP 

IAQA VQATKNYNKCFItNVCFAYTS RH E I SNAVREMAWGVEQGIX 

DPSDISESIJ^KCLYTNRSPHPDII*IRTSGEVRI>SDFIiI*WQTSH 

SCLVFQPVLWPEYTFWNLFEAI LQFQMNHSVLQK 


6694 


292 


813 


SLLLHLAPPGAYTPSQPLSSVSTETASSVRRQAAESRQHELPVR 
EVHSIiGQILPQDGLTAEAGPPEAQDPWGSPGISLPAAHIGFAAA 
LAVGPSGCHTEP\FDEWPSLFTK5DAYAARDKSKLIQLGITHVV 
NAAAG KFQ VDTGAKFYRGM S LEYYG I EADDNP FFDLS VY FL P 


6695 


292 


813 


Slil^HIoA^PGAYTPSQPLSSVSTETASSVRRQAAESRQHELPVR 
E VHS LGQI IjPQDGLTAEAG P PEAQDP WGS PGISLPAAH I GFAAA 
LAVGPSGCHTEP \ FDEVWPSLFXiGDAYAARDKSKLI QIjGITHW 
NAAAGKFQVDTGAKFYRGMSLEYYGIEADDNPFFDLSVYFLP 


6696 


1 


782 


PRVRGRVGERWAFLSVPAAMSSEMEPTiTiIjAWSYFRRRKFQIiCAD 
LCTQMIjEKS PYDQAAW I IiKARALTEMVYI DE I DVDQEG IAEMML 

denai aqvprpgtslklpgtnqtggpsqavrpitqagrp itgfi* 
rpstqsgrpgtmeqairtprtaytarpitsssgrfvrlgtasml 
tspdgpfinlsri*nltkysqkpklakai*ieyi fhh end vktald 
laalstehsqykdwwwk/dqiekcyyrvgmyreaekqikss 


6697 


3 


782 


" ppijrtirriinsraiirpgsrkvmavvpaslsgqdvgsfayltikdr 
i pqii*tkvidtu!rhkseffekhgeegveaekkais llsklrne 
ijqtdkpfiplvekfvdtdiwnqyi^qqsllnesdgksrwfysp 
wllv\ecymyrriheai\iqsppidyfdvfkeskeqnfygsqes 

i ialcthlqqli rti edld \ enqlkde ffkllx3i si»wgei svdl 
sl\sggesssqntnvx^sledlkpfili*ndmehlwsli*snck 


6698 


668 


754 


VGSCACAGS CKCKECKCTS CKKSECRAFP 


6699 


325 


492 


EGELP/ PARRVIiPRAMTASAQ PRGRRPG VGVGVWTS CKHPRCV 
UiGKRKGSVGAGSFQLPGGHLEFGETWEEC^QRETWEEAAIillLK 
NVHFASVVNSFIEKENYHYVTII^KGEVDVTHDSEPKKVEPEK^ 
ESKRI I YNHAFFFQESKWSGGILQ 


6700 


1098 


1392 


TQCWRSSTPGMRTHFRTQP/RI^CGQGFSQQENGHCMDTNECIQ 
FPFVCPRDKPVCVNTVGSYR(^TNKKCSRGYEPNEDGTACVERT 
LbLGI.CNIJjGK 


6701 


2 


1485 


AAAG PRTRVRRAAAFEGQPSPSPGLG PTSDKAAAPRTPKRRRL.W 
RQRQ/HPAMLCYVTRPDAVI^EVEVEAKANGEDCL^QVCRRLGI 
r EVDY FGLQ FTG S KGESL WLNLRNR I S QQMDGLAPYRLKLRVKF 
FVEPHLI LQEQTRHIFFliHIKEAIilAGHU^CS PEQAVELSAULA 
Q/nCFGDYNO^TrAKYNYEELCAKEI»S S ATLNS I VAKHKELEGTS Q 
AS AE YQVLQTVS AMEN YG I EWHSVRDS EGQKLI* I GVGPEG I S I C 
KDDFS P INRIAYPWQMATQSGKNVYLTVTKESGNSI VLLFKM I 
STRAASGLYRAITETHAFYRCDTVTSAVMMQYSRDIjKGHLASIjF 
LNEKINIX^KKYVFDIKRTSKEVYDHARRALYNAGVVDLVSRN^ 
SPSHS PLKSSESSblNCSSCEGLSCQQTRVIjQEKIjRKLKEAMl»CM 
VCCEEEINSTFCPCGHTVCCESCAAQI*QVGESAAHFCLQPHLSL 

uJj iuoKaWVlJfiK j 


6702 


397 


1971 


PI^KFIiKIiDIiVNVLCLP^DVFLFYRTCFCSMGLGSSCHI^SLPK 
RAEAIiIiCSRKATVAHlDLVAVRMAEEQE FTQLCKLPAQPSH PHCV 
NNTYRS AQH S Q ALLRGLLALRDSG ILFDWLWEGRH I EAHR I L» 
lAASCDYFKGMFAGGLKEMEQEEVIilHGVS YNAMCQI LHFI YTS 
ELELSL-SNVQETLVAACQIiQIPEI IHFCCDFLMS WVDEENI L.DV 
YRIiAELFDLSRIiTEQIjDTYIIiKNFVAFSRTDKYRQLPI^KVYSIj 
LSSNRLEVSCETEVYEGALLYHYSIiEQVQADQISLHEPPKLLET 
VRFPLMEAEVlXlRLHDKLDPSPUiDTVAJSALMY^ 
SPQTELRSDFQC^GFGGIHSTPS\MSSATRPKYLNPLLGEWKH 
FTAS LAPRMSNQG IAVLNNFVYI>IGGDNNVQGFRAESRCWRYD P 
RHNRW FQI QSIfQQEHADLS VCWGRY I YAVAGRD YHNDIjNAVER 
YT) PATNSWA WAP LKREVYAHAGATLEGKMY I TCG RKGRIT 


6703 


45 


1244 


G VGPRAAAM PLELEIiCPGRWVGGQHP CF I IAEIGONHQGDLDVA 
KRMIRMAKECGADCAKFQKSELEFKFNRKALERPYTS KHSWGKT 
YGEHKRHLEFSHDQ YREIiQRYAEEVG I FFTASGMDEMAVEFLHE 

lnvpffkvgsgdtnnfpylektak/trgwhsvtj^dvcgvqlnde 
tsswdvlgrvrtske kvijm*vldysgrpmvissgmqsfidtmkq 
vyqrvkplnpnfcflqctsayplqpedvnlrviseyqklfpdip 
igysghetgiaisvaavalgakvlerh itldktwkgsdhsasiie 

PGEIAE1jVRSVRI*VERALGSPTKQI^PCEMACNEKIiGKSWAKV 

kipegtii*tmdmltvkvgepkgyppedifni»vgkkvlvtvebdd 

TIMEE 


6704 


82 


1007 


tmntrnrvvnsglgaspasrptrdpqdpsgrqgei^spvedqreg 
leaap kg p s re s whagqrrtsaytlxapn inrrnb i qr iaeqe 
lanlekwkeqnrakpvhlvprrlggsqsetevrqkqqiiqlmqsk 
ykqklkl^esvrikkeaeeaeiiqkmkaiqreksnkiieekkrlqe 

NliRREAFR E HQ Q Y KTAE FL/RQTEHR IARQ KCL»S KCCLW PT I LN 

mgqklglqXdslkaeenrklqkmkdeqhqks ellelkrqqqeqe 

RAKXHQTEJJRJ?VNNAFLDRXjOX3KSQPGGI*EQSGGOWMNSGNSW 
GI 


6705 


2 


786 


RLCRNSARVPCGMSASRSUGEGAGFIGPLRGPHPRAGGTGTSFT 

SYKRKGGIMSTIAAFYGGKSILITVATGFLGKELMEKLFRTSPD 

LKVIYILVRPKAGQTLQHRVFQILDSICLFEKVIEVRPNVHEKIR 

AI YADliNQNDFAISKEDMQELLS CTNI I FHCAATVRFDDTLRHA 

VQLNVTATRQLLLMASG^PKLEAFIHISTAYSNQ^^ 

PCP VE PKKI I DSLEW\LDDAI IDEIT PKLI RDW PNI YTYTK 


6706 


130 


531 


PTHSSSSHSQEMLGKLNMLRNDGilFCDITIRVQDKIFRAHKVVlj 
AACS DFFRTKLVGQAEDENKNVLDLHHVTVTG F I PLLE YAYTAT 
LSINTENI IDVI1AAASYMQMFSVASTCSEFMKSSII1WNTPNSQP 

EK 


6707 


2233 


1343 


YVJSGIGYELQHFHWRKFHFEKKGPPSTCQBRLYBSRSRWPCIS* 
GMVWGWTAVNGS W * GGQLRCVCVCTSHSSDSTRSSQRAS KCHS 
FF I IiSQ * KT* S S WENWVFAKYS RI YS YGHS CS KGRGD * DFK*NV \ 
SQAR*SRFCGLQ^CGHCGIjDIin^GGSSPWTDKHSCVHKNIiLC 
NRRVFSLLCEGPGHCYQGAVCREACAAASPGLDSAAEPHRLCEH 
TD* LPK*GPGYIQHFHCDSNIIiCILYNISPNLFS YSF * GVARYA 
C*RCHWYFEWI^YNHCGDILVACL*RRQI,*SSQ 


6708 


115


1729 


TVGS WSRSGRS PPVGRQLLLTGRGAQAAGS PQGGMALQVE LVPT 
GE I IRVVHPHRPCKLALGSDGVRVTMESALTARDR VGVQD FVhh 
ENFTS EAAFI ENLRRRFRENLI YTYIGPVLVSVNP YRDT iQI YSR 
QHMERYRGV S FYEEP PHLLAVADTVYRALRTERRDQAVM I S VES 
GAGKTDATKRLLQLYAETCPAPQRGGAVRDRLLQSNPVLEAFGN 
AKTLRNDNS S RFGKYMDVQFDFKGAPVGGKI LS YLLEKS RWHQ 
NHGERNFHI FYQU^EGGEEETLRRLGLERNPQSYLYLVKGQCAK 
VSS INDKSDWKWRKALTVIDFTEDEVEDLLS IAASVLHLGNXH 
FAANEESNAQVTT1^QIJCYTjTRLI*SVEGSTLREAIjTHRKI I AKG 
EELLSPLNLEQAAY/ARDALAKAVYSRTFTWLVGKINRSLASKDV 
ES PS WRSTTVLGLTiD I YGFEVFQHNS FEQFCINYCNEKLQQLFI 

ELTTjKSEQEEYEAEG iawepvqyfnnkiicdlveekfkgx i\si 
LDE\ECLRPGE 


6709 


3 


894 


pphehlfpsgergpfsflvsrrglgpgkmgkkgkkekkgrgaek 
taakmekkvs kr s rkeee dl eal iah FQTLDAKRTQT VE Li P CPP 
PSPRIiNASl*SVHPEKDEIiIIiFGGEYFNGQKTFI*YNEIjYVYNIRK 
DTWTKVDIPSPP PRRCAHQAVWPQGGGQLWVFGGEFAS PNGEQ 
FYHYKDLWVTiH LATKTW EQ VKS TG GPS GRSG HRMVAW KRQli I IiF 
GGFHESTRDY I YYNDVYAFNLDTFTWS KLSPSGTGPTPRSGCQ\ 
I PSI.PRAASSVYGGYSKQRVKKDVDKGTRHSDMF 


6710 


158 


980 


RHKMTNYRVES SSGRAARKMRLALMGPAFIAAIG Y IDPGNFATN 
I QAGAS FG YQ LbWVVVWANliMAM L I Q I LS AKLG I ATG KN LAEQI 
RDHYPRPWWF YWVQAB I IAMATDIAEFIGAAIGFKLILGVSLJj 
QGAVLTGIATFlil LMliQRRGQKPLEKVIGGLI*LFVAAAYI VELI 
FSQPWLAQLGKGMVX PSLPTS EAVFLAAGVL \GAT IMPHVI /YT 
WHSSLTQHLHGGSRQQRYSATKWDVAIAMTIAGFVNLAI MATAA 
SELNFYGHTGVA 


6711 


3 


347 


VTECKTMTCKMSQLERNI *TMINTLHHYSVKLGHPDTI,IHGEFK 
ELVRTDLHNILMXENKNDQAI *HI MEDLDTNAHMQI IFKELIML 
MAMLT W S YHDNMHDAD YG PG QQHRP G 


6712 


118 


578 


PHGQKRTRYPQVRAPGQQPQAQLAHALCLKQVFAKDKTFRPRKR 
FEPGTQRFELYKKAQASLKSGLDIiRSWRLPPGENIDDWIAVHV 
VDFFNRINLI YGTMAERCS * TSCP VMAGGPRYEYRWQDERQYRR 
PAKLSAPRYMALLMDWIESH 


6713 


2485 


3 


QARGS DS EDGEFEIQAEDDARARKLGPGRPLPTFPTS ECTSDVE 
PDTREMVRAQNKKKKKSGGFQSMGLSYPVFKGIMKKGYKVPTPI 
QRKTI PVllM)GKDVVAMAKTGSGKTAClrIjlj*Wr tK-LKinbAljlb 
ARAL I LS PTRELALQTLKFTKELGKFTG LKTALI IiGGDRMEDQ F 
AALHENPDI I IATPGRLVHVAVEMSIiKLQS VEYWFDEADRLFE 
MGFAEQLQEI IAR I . PGGHQTVL FS ATLP KLLVEPARAGLT EPVt, 
IRLDVDTKXKRQLKTS FFLVREDTKAAVIXHIJ^HNVVRPQDQTV 
VFVATKHHAEYLTELLTTQRVS CAHI YSALDPTARKI NLAKFTL 
GKCSTLI VTDIiAARGLDI PLJLDNVINYS FPAKGKIiFLiHRVGRVA 
RAGRSGTAYSLVAPDEIP YIJUDLHLFIjGRSIiTLARPIiKE PSGVA 
GVDGMIjGRVPQSVVDEEDSGI»QSTI^ASLEI^GLARVADNAQQQ 
YVRSRPAPSPES I KRAKENDLVGLGLHPLFSSRFEEEELQRLRl* 
VDSI KNYRSRATI FEINASSRDLCSQVMRAKRQKDRKAIARFQQ 
GQQGRQEQQEGPVGPAPSRPALQEKQPEKEEEERAGESVEDIFS 
EWGRXRQRSGPNRGAKRRREEARQRDQEFYIPYRPKDFDSERG
LS I SGEGGAPEO^AAGAVIJDIi^DEAQr^TRGRQQLKWDRKKKR 
FVGOSGQEDKKKIKTESGRYT SSSYKRDIjYQKWKOKOKI D* S* I* 
GRRRG ILTRRRPR TEEVGEAR PLAQAGCI PG PHAPRHPI»QAESA 
LELKTKQQILKQRPJ^QKAALSLQRWWPQAALCPQ 


6714 


169 


1416 


NNCQELLPPPPAPMAHI PSGGAPAAGAAPMGPQYCVCKVEbSVS 
GONLIjDRDVTSKSDPFCTVTjFTENNGRWIEYDRTSTAINNTjNPAF 
SKKFVr^YHFEEVQKLKFALFEJQDKSSMRI^EHDFXGQFSCSLG 

T T V.<? <3 K K T TP PT „T .r .T.WT> FTP nCZKK T . T T T AAORT.R DNR VTTLSLAG 

RRLDKKDLFGKSDPFLEFYKPGDDGKWMLVHRTEVIKYTLDPVW 
KP FTVPLVS IjCDGDMEKP I QVMCYD YDNDGGHD F I GE FQTS VSQ 
MCEAPJDSATPLEFECINPKXQRKKKNYKNSGI I ILRSCKINRDYS 
FIJ)YIOK;CQLMFI^GIDFTASNGNPIJ3PSSIiHYrNPMGThrEYX, 
S AI WAVGQI I QDYDSDKMFPAXiG FGAQIjP PDVJ KVSHEFAI NFNP 
TNP FCSGVDG IAOAYSACLP 


6715 


32 


493 


GPAGAESGSLHCIjPATVQAIAGAAHSPHGGQPPRRGPLIGSGMP 
GKPKHIjGVPNGRMVLAVSDGELSSTTGPQGQGEGRGSSLS ihsi> 
PSG PSS P FPTEEOP VAS WALS FEPJ^IiODPLGTjAYFTEFI>KKE FS 
AENVTFWKACERFQQI PASDT 


6716 


1 


176 


GAGGPAPRSFGS EEPRAAIjERDKMSARAAAAKS tameetai weq 
HTVTLHRVSI»CCSK 


6717 


115 


896 


LFAMSGFENLNTDFYQTS YS I DDQSQQSYDYGGSGGPYSKQ yag 

ydysqqgrfvppdmmqpqqpytgqi yqptqaytpaspqpfygnn 

pmvfclafgatlllagkiqfg yvygi saigclgmfcllnlmsmt 
gvs fgcvas vlg ycllpm i llss favi fslqgmvg 1 1 ltag i ig 
wcsfsaskifisalamegqqllvaypcallygvfaiilsvf 


6718 


290 


599 


KQSSTVPGTI LPSLKWHNSGI*CKFPETGGKMTTFKEGtiTFKDVA 
VI FTEEELGLLD P VQRNliYQD VMLEN FRN LLS VGHHP F KHDVFL 
LEKEKKLDIMKTATQ 


6719


X 


O JL 


PTRPEEODREDGKCHKMEMNP ISGNLNCDPIAMSOGS SDHGCET 
DLDSDDDKIEKPNNFMKDSASQDNGLSRKISRKRVCSSDSDSSL 
QWKKSS KARTGIJ^ ITRRCAATAANKI KLMSDVEDVSI*ENVHT 
RSKNGRiCKPLHLACTTAKKKliSDCEGSVHCEVPSEQYACEGKPP 
DPDS EGSTKVIiSQALNG DS DS EDMLNS EHKHRHTNIHKIDAPSK 
RKSSSVTSSG 


6720 


3 


822 


HEVAEEAGGTVYPQRGTMPGTKRFQHVIETPEPGKWEIjTGYEAA 
VPITEKSNPI*TQDLDKADAENIVRIiLGQCIlAEIFQEEGQALSTY 
QRLYS ES I LTTMVQ VAGKVQEVLKE PDGG LWLS GGGT SGRMAF 
LMSVS FNQLMKGLGQKPLYTYLIAGGDRSWASREGTEDSALHG 
I EEL KKVAAG KKRV I VI G I SVGLSAPFVAGQMDCCMNNTAVFLP 
VLVGFNPVS^4ARHPFPPPRI1J^LTVFPSLRAPHYQITSLLFSM 
SWTLISE 


6721 


3 


822 


HEVAEEAGGTVYPQRGTMPGTKRFQHVIETPEPGKWEIjTGYEAA 
VPITEKSNPLTQDLDKADAENI VRLLGQCDAEI FQEEGQALSTY 
QRLYS ES ILTTMVQVAGKVQEVLKEPIX^LVVLSGGGTSGRMAF 
IJ4SVSFNQLMKGLGQKPLYTYLIAGGDRSWASREGTEDSALHG 
IEELKKVAAGKKRVIVIGISVGLSAPFVAGQMDCCMNNTAVF1.P 
VIjVGFNPVSMARHPFPPPRIIJtSLTVFPSLRAPHYOITSIiLFSM 
SWTLISE


6722 


1 


390 


RS WS KRTWQALPP^AVL»FIjIaI*FI*CGTPQAADNMQAI YVALG EAVE 
L P CP S P S TLHGDEHLS W FCS PAAGS FTTL VAQ VQ VGR P APD PG K 
PGRES RLR1XGNYS L WLEGS KEEDAGRYW CAVLGQHHNYQNW 


6723 


173 


659 


VCQYCTARMAD FG I S AGQFVAWWD KS S P VEALKGLVDKLQALT 
GNEGRVSVENIKQLLQSAHKESSFDIILSGIiVPGSTTIjHSAEII* 
AE I AR I LRPGGCLFXiKE PVETAVDNNS KVKTAS KLCS ALTLSGL
| VEVKELpQRE pltpbevqs vrehlghesdni* 


6724 


173 


659 


I VCQY CTARMAD FG I SAGQFVAWWDKSS PVEALKGLVDKLQALT 
GNEGRVSVENI KQLLQSAHKESSFDI ILSGLVPGSTTLHSAEI L 
AE I ARI IJIPGGCLFLKEPVETAVDNNSKVKTASKIjCSAIjTIiSGI* 
VEVKELQRE PLTPEEVQS VREHLGHESDNL 


6725 


356 


722 


RRRTPPVI IlAT^lDDD^l^U*AIJlI^QEEWNIJQEAERDHAQESI*SI»VD 
ASWEIjVDPTPDLG^FVQFWDQPFWGQLKAWBV^ 
J ICSYEGKGGMCSIRI^EPLLKLRPRKDLVEVFFV 


6726


98 


714 


HI^lQMERKIiniREKEKEYEGKHNSI^DTDC^KNCKSTL 

G YLYITQKQTLTKYPDTFLEGI VNGKI LCPFDADGHYF IDRDGIi 

L FRHVLNFLRNG ELLLP EG FR ENQ LLkAQEAE F FQLKGLAE EVKS 

RWEKEQLTPRETTFLEITDNHDRSQGIiRIFCNAPDFISKIKSRI 

VLVSKSRIiDGFPEEFSISSNIIQFKYFIK 


6727 


1 


831 


FRGMGDERPHYYGKHGTPQKYDPTFKGPI YNRGCTDI I CCVFIiL 
IAI VGYVAVG I IAWTHGDPRKVI YPTDSRGE FCGQKGTKNENKP 
YLFYPNIVKCASPLVLIjEFQCPTPQI cvekcpdryi*tyx»narss 
RDFEYYKQFCVPGFKNNKGVAEVLRDGDCPAVIjIPSKPIiARRCF 
PAIHAYKGVLMVGNETTYEI>3HGSRKNITDLVEGAKKANGVLiEA 
RQLAMRIFEDYTVSWYWDI I SLGIAMAMSLLFI I LLRFLAG IMG 
j RGMIIMGIliVLGY 


6728 


486 


935 


FC^SWLRSLADSSLSWKMFIjVGLTGGIASGKSSVIQVFQQLGCA 
VIDVD\n«ARHVVQPGYPAHRRIVEVFGTEVLI*ENGDrNRKVIiGI> 
L I FNQPDRR QLIJtfAI THPE I R KEMMKETFKYFLiRE PR TS PRGKK 
HVPSALKEADSLMRRDT 


6729 


259 


1191 


VGLTGAQSGRTASMGRI^RAVAGPAIiRRWIjIaljGTVTVGFliAQSV 1 
LAGVKKFDVPCGGRDCSGGCQCYPBKGGRGQPGPVGPQGYNGPP 
GLG<5FPGIjOGRKGDKGERGAPGVTG p kgdvgarg vsgfpgadg I 
PGHPGQGG P RGR PG YDGCN GTQGDSG PQGPPGS EGPTGPPG PQG 

pkgqkgepyalpkeerdryrgepgepglvgfqgppgrpghvgqm 
gpvgapgrpgp pgppgp kg qqgnrglg fyg vkgekgdvgqpgpn 
gipsdtlhp i iaptgvtfh pdqykgekgsegepg irgislkgbe 

GIM 


6730 


784 


1015 


NMVDYYEVLGLQRYASPEDIKKAYHKVAliKWHPDKNPENKEEAE 
RKFKEVAEAYBVLSNDEKRD I YDKYGTEGLNEF 


6731 


1 


446 


GIRKRIiHGAWPRVEVGCPWETRESEGVHLERPTS PLKNNDEGS 
LDX YAGLDS AVSDSASKS CV P SRNCLDL YEB I1»TE EGTAKEATY 

NDLQVEYGKCXJLQMlCEIJ^KKFKEIC/rQNFSLINENQSIjKKNlSA 
LI KTARVE INRKDE E I 


6732 


102 


1205


GRWQRRPPPPSPPLWCLQPGGGSDPQQI.TQIjRHCLSHSPQDTPW 
AQRQVCYTAATTQAAAPATRNCLPDHSGHRPTPPR S HRHHRQEN 
LGSIKPSSRSTKATSTIMAGDGRRAEAVREGWGVYVTPRAPIRE 
GRGRLAPQNGGSSDAPAYRTPPSRQGRREVRFSDEPPEVYGDFE 

plvakers pvgkrtrlee frs dsakee vresayylrsrqrrqpr 
pqeteemktrrttrlqqqhseqpplqpspvmtrrglrdshssee 

DEASSQTDLSQTISKKTVRSIQEAPAVSEDLVIRIiRRPPIjRYPR 

ybatsvo^kvnfseegeteeddqdsshssvttvkarsrdsdesg 

DKTTRSSSQYIES FW 


6733 


613 


1311 


rscrqvgmrsrnqggesasdghiscpkpsi ignagekslsedak 
kkkkslsn^k^dwasgtvxrhiotsgecerktkkslelskedli 

QLLS I MEGELQ AREDVXHMLKTE KTKPEVLEAHYGSAE PEKVLR 
VI*HRDAI1*AQEKSIGEDVYEKPISELDRXiEEKQKETYRRMLEQIj 
LI*AEKCHRRrVYEI*ENEKHKHTDYt©nCSDDFTNIi 
LLEQEKAYQARKE 


6734 


189 


551


SAAMFPVFSGCFQELQEKNKSLELVSFEBVAVHFTWEEWQDJbDD 
AQRTLYRDVMIjET YS SLVSLGHCI TKPEM I FKLEQGAE P W I VEE
TLNLRLSGG S KKQVFS G I CHRS L VELQBVHLV 


6735 


280 


558 


KSRRAGVTKMSNPFLKQVFNKDKTFRPKRKFEPGTQRFELHKKA 
QAS I^AGLDI^RIJWQLP PGEDLNDWVAVHVTO I 
XDGCT 


6736 


195 


808 


MNYELNFKREMPN I KS LGLTNLNFLLKRLSS VLPLI TDYVYFEN 
SSSNPYLIRRIEEIJNKTASGNVEAKWCFYRIUUDISNTLIKI^ 
KHAKEIEBESETTVEADLTDKQKHQLKHRELFUSRQYESLPATH 
IRGKCSVALLNETESVLSYIJDKEDTFFYSLVYDPSLKTIT.LADKG 
E IRVGPR YQAD IPEWLLEGTFFCVFAVL 


6737 


ISO 


1209 


PVIMPI^FSPGDIVRPSCCVSSSPKIJUWAHSRIjESYRPDTDLS 
REDTGCNIiQH I S DREN I DDLNME FNPSDHPRAS TI FLS KS QTD V 
REKRKS LF1NHHPPGQ IARKYS S CSTI FLDDSTVSQPNLKYT I K 
CVAIiAIYYHIKNRDPIXSRMIiliDIFDENLHPLSKSEVPPDYDKHN 
PEOKOTYRFVRTTiF^AAOT .TAF.CATVTLVYTiKRT.T.TYAK TDI CP 

AN W KR I VLGA I LLASKVWDDQA VWNVD Y CQ I LKDITVEDMNEXJB 
RQFLELU3FNINVPSSVYAKY^FDI^SLARANN]^SFPI*EPLSRE 
RAHKLEAI S R LCEDKYKDLRRSAR1CRSAS ADNLTLPRWS PAI I S 


6738 


148 


653 


CAC^QPARAEVGAATALPVRWASGEMAPSGSLAVPIaAVLVIjLL 
WGAPWTHGRRSNVRVT TDPINWRRI1I1EGDWMT EPYAPWCP ACHNT» 
QPEWESFAEWGEDLEVNIAKVDVTEQPGIiSGRFI ITALPTI YHC 
KDGEFRRYQGPRTKKDFINFI5DKEWKSIEPVSSWF 


6739 


3 


631 


SWPDMAEEEVAKLEKHI^LJLjRQEYVK^ 

ANKESS S ES F I SRLLA IVADLYEQEQYSDLKI KVGDRH I SAHKF 
VLAARSDSWSIANLSSTKEI^LSDANPEVTMTMLRWIYTDELEF 
REDDVFIiTCLMKIANRFQLQLbREllCEKGVMSLVNVRNCIRFYQ 
TAEELNASTLMNYCAE 1 XASHWVSEVEGVNKAl* 


6740 


3 


631 


SWPDMAEEEVAKLEKHLMLLRQEYVKLQKKLAETEKRC^ 
ANKESSSESFlSRIjIiAIVADLYEQEQYSDLKIKVGDRHISAHKF 
VIiAARSDSWSLAN^^STKEI i Dl^DANPF^^^m^WIYTDELEF 
REDDVFLTELMKIJ^NRFQI^IjLRERCEKGVMSL 
TAEELNASTLMNYCAE 1 I ASHWVSEVEGVNKAL 


6741 


141 


960 


PLTLPFSSRARAGHTMNTSPGTVGSDPVILATAGYDHTVRFWQA 
HSGICTRTVOHQDSQVNALEVTPDRSMIAAAVOPVSIjGYQHIRM 
YDLNSNNPNP 1 1 S YDGVNKNIASVG FHEDGRWMYTGGED CTARI 
WDLRSRNLQCQR I FQVNAP I NCVCLHPNQAELIVGDQSGA IH IW 
DLKTDHNEQLIPEPEVSITSAHIDPDASYMAAVNSTLVPFSCLL 

plaigi lqegefeslarivgllfiaco^ncyvwnltggigdevtq 
lipktkip 


6742 


141 


960 


PLTLPFSSRARAGHTMNTS pgtvgsdpvilatag ydhtvrfwqa 
HSGI CTRTVQHQDSQVNALEVTPDRSMIAAAVQP vslgyqh I RM 
YDLNSNNPNP IIS YDGVNKNIAS VGFHEIDGRWMYTGGEDCTARI 
WDLRSRNLQCQRI FQVNAP INCVCLHPNQAELI VGDQSGAIHIW 
DLKTDHNEQLIPEPEVSITSAHIDPDASYMAAVNSTLVPFSCLL 
PIAIGILQEGEFESIJu^JlGIiLFIiACG^NCYVWNLTGGIGDF/V*rQ 
LIPKTKIP 


6743 


1 


412 


MHSTQDKSLHLEGDPNPSAAPTSTCAPRKMPKRI S ISKQLASVK 
AIiRKCS DLE KA1ATTAL I FRNSSDSDGKLEKAIAKDLLQTQFRN 
FAEGQETKPKYREILSEXiDEHTEKKIiDFEDFMILLLS ITVMSDL 
LQNIR


6744 


95 


1343


RTPARNRCAGCE VLS RFSS PNKAS S FALQSAGGGLPAVRALRRD 
RQKVSTVG YGMDEVEQDQHEARLKELFDS FDTTGTGS LGQEELT 
DLCHMLS LEE VAP VLQQTLLQDNT J <GRVHFDQFKEALI LZLSRT 
LSNEEHFQEPDCSLBAQPKYVRGGKRYGRRSLPEFQESVEEFPE 
VTVIEPIJ)EEARPSHIPAGDCSEHWKTQRSEEYEAEGQLRFWNP 
DDLNASQSGSSPPQDWI EEKLQEVCEDLGITRDGHLNRKKLVS I
CEQYGLQNVDGEMLEEVFHNLDPDGTMSVEDFFYGLFKNGKSLT 
PSASTPYRQLKRHI^MQSFDESGRRTTTSSAMTSTIGFRVFSCL 
DDGMGHASVERI LDTWQEEG I ENSQEI LKALDFGLDGNINLTEL 
TIiAIiENELLVTKNS IHQACI 


6745 


1 


588 


TFRDQGWAQRRRWLLGCASWESWEAAIAAGPGIjPSSTARQQNNP 
AAGTEC FAAVWARGTAMGSVL>S TDSGXSAPASATARALERRRDP 
ELPVTS FDCAVCI>EVLHQPVRTRCGHVFCRSCIATSIiKNNKWTC 
PYCRAYI.PSEGVPATDVAKRMKSEYKNCAECDTI>VCLSEMRAHI 
RTCQKYIDKYGPLQELEETA 


6746 


110 


492 


GATGAMAESAPARHRRKRRSTPLTSSTLPSQATEKSSYFQTTEI 
S LWTWAAI QAVEKKMESQAARLQSLEGRTGTAEKKLADCEKMA 
VEFGNQLEGKWAVI^TLI^EYGLLQRRIiBNVF^IiIiRNRN 


6747 


247 


484 


BAVTFKDVAWFTEEELGIxLDLAQRKIjYRDVMLENFRNLDSVGH 
QPFHRDTFHFLREEKFWMMD IATQREGNSVYAGVC 


6748 


201 


665 


MTTFKEAVTFKDVAW FTEEELGLLDPAQRKX)YRDVMLENFRNL 
LS VGNQP FHQDTFHFIjGKEKFWKMKTTSQREGNSGGKI Q I EMET 
VPEAGPHEEWSCQQIWEQIASDLTRSQNSIRNSSQFPKEGDVPC 
Q I EARLS I SXVQQXPYRCNECKQ 


6749 


95 


719 


RREVKGGIXJVCPRARGSPQSQQFPSCMGGEGIjQQSGEALDGAM 
SAGGPCPAAAGGGPGGASCSVGAPGGVSMFRWIjEVIiEKEFDKAP 
VDVDLIiLGEIDPDQAPITYEGRQKMTSI^SCFAQIjCHKJVQSVSQ 
INHKLEAQIiVDIjKSKKrETQAEKVrVIiEKEVHDQI*LQLHS IQLQL 
HAKTGQSADSGTIKAXLSGPSVEELERELXAN 


6750 


3 


428 


SCESRRPGAJG^WASGAiPRDTTGLGSEQPSGDVAQSNRATMGT 
TAPGPIHLLELCDQKIjMEFIiCNMDNKDIjVWLEEIQEEAERMFTR 
EFSKEPELMPKTPSQKNRRKKRRISYVQDENRDPIRRRbSRRKS 

RSSQLSSRR 


6751 


152 


1417 


PTKATEMAGAS VKVAVRVRPFNS REMSRDSKCI IQMSGSTTTIV 
NPKQPKETPKSFS FDYSYWSHTSPEDINYASQKQVYRD IGEEML 
QHAFEG YNVC I FA YGQTGAGKS YTMMGKQEKDOQG 1 1 PQLCEDL 
FSRINDTTNDNMSYSVEVSYMEIYCERVRDI.LNPra^NLRVRE 

HPLIXSPYVEDLSKLAVTSYNDIQDLMDSGNKARTVTVATN^ 

SRSHAVFNIIFTQKRHDAETNITTEKVSKISLVDIiAGSERADST 

GAKGTRLKEGAN I NKS LTTLG KV I S ALAEMD S G PNKNKKXKKTD 

FIPYRI)SVLTWIjIiRENI»GGNSRTAMVAAI^PADINYDETLSTLR 

YADRAKQIRCNAVINEDPNNKLIRELKDEVTRLRDIXYAOGLGD 

ITDMTNALVGMSPSSSI*SA1>SSRNV 


6752 


24 


1834 


RNCVPPI^CYRSRVKFHSDIKMQYSHHCEHLLERLNKQREAGFIj 
CDCTIVIGEFQFKAHR1TVIjASFSEYFGAJCY11STSE1JNVFI»DQSQ 
VKADGFQKXiLEF I YTGTLNLDS WNVKEIHQAADYLKVEBVVTKC 
KIKMEDFAFIANPS STE ISS ITGNIELNQQTCLIiTLRDYNNREK 
S EVS TDD I Q AN P KQG ALAKKS S QTKKK KKAFN S PKTGQNKTVQ Y 
PSDILENASVELFLDANKLPTPVVEQVAQINDNSELELTSVVEN 
TFP AQD I VHTVTVKRKRGKSQPNCALKEHSMSN I ASVKS PYEAE 
NSGEELDQRYSKAKPMCOTCGKVFSEASSIJIRHMRIHKGVKPYV 
CHLCGKAFTQCNQLKrHVRTHTGEKPYKCELCDKGFAQKCOIiVF 
HSRMHHGEEKPYTCCDVC3^IK2FATSSNLKIHARKHSGEKPYVCDR 

CGQRFAQAS TLTYKVRRHTGE KP YVCDT CG KAFAVS S S L I THS R 
KHTGEKPFIC^CGNSYTDIIOSnbK^KTK^SGADKTLDSSAED 
HTLiS EQDS I QKS PLS ETMDVKPSDMTI»PLAI*PLGTEDHHMLLPV 
TDTQSPTSDTLLRSTVNGYSEPQLI FLQQLY 


6753 


2 


1305 


" VPSLP Y PPQKWAHTE FTTSSDSETANGIAKPDP VMPGGEEKAS 
PFGIKLRRTNYSLRFNCDQQAEQKKKKRHSSTGDSADAGPPAAG 
SARGEKEMEGVAL.KHGPSLPQERKQAPSTRRDSAEPSSSRSVPV 
AHPGPPPASSOTPAPEHDKAANKMPLAQKPALAPKPTSQTPPAS
PLSKLSRPYLVELLSRRAGRPDPEPSEPSKEDQBSSDRRPPSPP " 
GPEERKGQKRDBEEEATBRKPAS PPLPATQQEKPSQTPEAGRKE 
KPMLQSRHSIJX5SKLTEKVETAQ plwi tlalqkqkgfreqqatr 
EE RKQ AREAKQAEKLS KENVSVS VQPGSSSVSRAGSLHKSTALP 
EEKRPETAVSRLERREQIJa<ANTLPTSVTVE IS YSS PAAPLVKE 
VSKRFSSPDDAPVSSEPAWLAIAKRKAKAWSDCPLIIK 


6754 


2 


413 


FVRRRRRRI/3GPEVNTMSSLHKSRIADF^^ 

LS FSGI PCEGGLRCLCW KI LLNYL PLERASWTS I LAKQREL YAQ 

PLREMI I Q PG I AKANMG VS RE DVT FEDH P LN PNPDSRWNTY FKD 

NEVLL 


6755 


298 


1343 


PGLQLQVALEADWFLDMPGGRRGPSRQQIjSRSALPSIiOTLVGGG 
CGNGTGLRNRNGSAIGLPVPPITALITPGPVRHCQIPDLPVDGS 
LLFEFLFFI YLLVALFIQY INI YKTVWWYPYHHPASCTSLNFHIj 
ID YHLAAF I TVMLARRL VWAL I S EATKAG AAS M I HYMVL I S ARIj 
VLLTLCX5WVLCWTLVNLFRSHSV1^N^ 

DSRAHT.LLTDYNYWQHEAVEESASTVGGLAKSKDFLSLLLESL 
KEQFNHATPIPTHSCPLSPDLIRNEVECLKADFNHRIKEVLFNS 
LFSAYWAFLPLCFVKVSGYbTFMCFT^DLCVl^INWVFLV 


6756 


180 


754 


I ERALGSLPLS I PVSWGSI^TIJCYQC^PI^PKVLLCQTRVQCHD 
LRSLQPQPPGLKQS FCLRVLGI^QTGATTPGIxRDLTCKELI ILTE 
REAQKRKKRKE KES GMALTQG PLT FRDVAI E FS QEEWKS LD P VQ 
KALYTOVMIjENYRNLVFIjGKDNFAIjEVTCI CPRVFIjY WE 
PFHYLTETEALLTHK 


6757 


2 


459 


NSRVEAPEAHSRESCXSSDAMRKHLSWWWIATVCMLnFSHLSAVQ 
TRGIKHRI KWNR KALPSTAQI TEAQVAENRPGAF I KQGRKLD ID 
FGAEGNRYYEANYWQ FPDG I HYNGCS EANVTKEAFVTGCINATQ 
AANQGEFQKPDNXLHQCVLW 


6758 


1 


1008 


ASGPELPGRRFRDRAPWLPARLI^GVIiAVVJVSI^ALGPGSFCRR 
RVPSLAQLGHSEAAPSPDDVRWSRVPDRCPEERDRAWPPPPPPS 
LPPS FRRNMANNS PALTGNS QPQHQAAAAAAQQQQQCGGGGATK 
PAVSGKQGNVXjPLWGNEKTMNLNPMI LTN I LSS P YFKVQLYE LK 
TYHEWDE IYFKVTHVEPWEKGSRKTAGQTGMCGGVRGVGTGG I 
VSTAFCXLYKIiFTLKLTRKQVMGLITHTDSPYIRALGFMYIRYT 
QPPTDLWDWFESFLDDEEDLDVKAGG^CVMTIGEMLRSFIjTICLE 
WFSTLFPRI PVPVQKNIDQQIKTRPRKI 


6759 


1 


513 


RKHNFHSLDGTSTRAFHPQTGLPLLSSPVPQRKTQSGCFDLDSS 
LLHLKSFSSRSPRPCLNIEDDPDIHEKPFLSSSAPPITSLSLLG 
NFEE^VLNYRFDPLGIVUGKl'AFAAGASGAFCPTHLTLPVEVS FY 
SVSDDKAPS PYMGVI TLESLGKRG YRVP PSGTIQWCVL 


6760 


239 


606 


VLS KKKGLSAEEKRTRMMEI FSETKDVFQLKDLEKIAPKEKG I T 
AMS VKEVLQS LVDDGMVDCBR IGTSNTYYWAFPSKALHARKHKLE 
VLESQLSEGSQKHASLQKS I EKAKIGRCETEERT 


6761 


29 


1733 


ERTLRGLREVAAPS D VADAAVS RRGRCCCCLHCTQTQ VAQDCP S 
S S S SVQRCELS LFQS LHTMTS K KL VN S VAGCADDALAGL VACNP 
NXjQLLQGHRVALRSDLDSIjKGRVALL^GGGSGHEPAHAGFIGKG 
MLTGVIAGAVFTS PAVGS I LAAI RAV AQ AGTVGTLL I VKNYTGD 
RLNFGLAREQARAEG I PVEMWIGDDSAFTVLKKAGRRGLCGTV 
L I HKVAGALAEAGVG LE E IAKQ VNWT KAMGTLG VS LS S CS VPG 
S KPTFFXS ADEVELGLG IHGEAGVRR I KMATADE IVKLMLDHMT 
NTTNASHVP VQ PGSS WMMVNNLGGLS FLELG I IADATVR S LEG 
RGVKXARALVGTFMSALEMPGISLTLLLVlDEPIJjKLIDAETTAA 
AWPNVAAVS I TGRKRS RVAP AE PQEAPDSTAAGGS AS KRMALVL 
ERVCSTLLGLEEHLNALDRAAGDGDCGTTHSRAARAIQEWLKEG 
PPPASPAQLLS KLS VLL L EKMGGS SGAL YGLFLTAAAQP LKAKT 
SLP AWSAAMDAGLEAMQKYGKAAPGDRTMLDS LWAAGQBL


6762 


3 


613 


ASTISWRLCVAGAEARRPVPVAGEUtAGGGAMWFMYLLSWLSLFI 

fY\r7\ pttt .'RTfTv a^t.wt jvffT.TPWvnraTCDT t vwminrcTairT Tl^ 
UvAM I iirt.VAH.vjb X I JLiAMJ-LCC I 1 ViilbKlllvIrllnro i>\ v 1jJ_vj 

LYVFERFPTSMIGVGLFTNIjVYFGLtjQTFPFIMLTSPNPI LSOT 
LVVVNHYLAFQFFARE Y YPFSEVLAYFTFCLW I IPPAFFVSLSA 
GBNVL PSTMQPGDDWSN YFTKGKRGK 


6763 


2 


760 


SGPDFPGRRFRGCCCVRPPAGAGMELGGKWDMNSAPRLVSETAE 
RKQEQKZTGTEAKAADSGAVGARRr IiljCIiYJX^Fl^Ijr \i V SMV V F 
LLSLHVKS LGAS PTVAG I VGS SYG I LQLFS STLVGCWS DWGRR 
SSLLACILLSALGYXLLGAATNVFLFV^ 

ALLSDWPEKBRPIiVIGHFNTASGVGFTLGPWGGYI^TELBDGF 
YLTAFICFLVFILNAGI»VWFFPRREAKPGSTE 


6764 


80 


438 


LKKMDTMMLSVRNLFEQLVRRVEILSEGNEVQFIQIAKDFEDFR 

KKWQRTDHBIK3XYKDI^KAETERSAI£>V^ 

RQRAEADCEKLERQIQLIREMLMCDTSGSIQ 


6765 


3 


550


ARYSRVDHFCRRRCRAVARAPRFIiLQFPSGPSRHFIAACVARWIj 
RGS VLVSEAIiSGSAKDGI VTEVAVGVKRGS DE LLSGS VLSS PMS 
NMSSMVVTANGNDSKKFXGEDKMDGAPSRVLHIRXLPGEVTETE I 
V1ALGLPFGKVTNXI^XGKWQAF1»EI*ATEEAA^ • 
PHLRNQ i 


6766


1 


1287 


EGGSFKASLTWLWPLK3EWKLHCEVEVISRHLPAI/3L.RNRGKGVR \ 
AVLSIjCC2QTSRSQPPVRAFIiIjISTI*KDKRGTRYELRENIEQFFT j 
KFVDEGKATVRLXEPPVD ICLSKANSS SI»KG FLS AMRLAHRG CN j 
VDTPVSTLTPVKTS EFENFKTKMVITS KKDYPLS KNFP YS LEHD 
QTSYCGLVRVDMRMIXriiKSLRKLDLSHNHIKICLPATIGDLIHLQ 
ELNLNDNHLES FSVALCHSTLQKSLWS LDLS KNKI KALPVQFCQ 
LQEI*KN£JCuDDNE I*XQF PCKI GQL INIjRFIjSAAKNKIjPFIiPi»c.r 
RNI^LEYIJDIjFGNTFEQPKVIJ^IKI^APLTLI»ESSARTI1.HNR 

ipygshiipfhi,cqdldtakzcvck;rfclnsfi^ 

AHTvVIiVDNXjGGTEAP 1 1 SYr GSl>G<-XVNd£» JJ -L 


6767 


336 


919 


APMICLCSSDLQFRYKEAFLRDRGU3IGYCSVDDDPRMKHFLNV 
GRLQSDNEYKKDFAKSRSQFHSSTDQPGLU3AKRSGX3LASDVHY 

PPGSYKVEMARRAAFJLiANARGLGLQGAYRGAEAVEAGDHQSGEV 
NPDATEILHVKKKKALLI* 


6768


2 


363 


PGSTISCYIiLSEGSI»PI,CMQVACGEEKHRAPTMKTIiRARFKKTE 
liRLSPTDLGSCPPCGPCPIPKPAARGRRQSQDWGKSDERIiQAV 
ENNDAPRVAAI>IARKGI»VPTKLDPBGKSAFHL 


6769 


284 


396 


MSTPDFSTAENNQEI*ANEVS CUCAMLTLMIjQAMGQAD 


6770 


1 


397 


QRNYQVI WSSTMAKLHDY YKDEWKKLMTEFNYNSVMQVPRVEK 
ITI^MGVGEAXADKKLLDNAAADLAAISGQKPLITKARKSVAGF 

k^rqgypigckvtlrgermwefferlitiavprirdfrglsaks 


6771 


3 


376 


APAGTLAMTGKSVKDVDRYQAVIiANIJjLEEDNKFCIADCQS KG PR 
WASWNTGVFIC1 RCAGIHRNIjGVHISRVKSVNLDQWTQEQI QCM 
QBMGNGKANRXjYEAYLPETFRRPQIDPYLiFWSNLEG 


6772 


1 


1400 


AAAFI*0^MTVNGFINTVITS1*\ERRYDLHSYQSGI*IASSYDIAA 
CLCLTFVSYFGGSG\HKPRWLGWGR\VLMGTGSLVFAL.PHFTAG 
P+ *GWKLDAGVRTCPANPR\ PVCAG\HTSGLSRYQLVFMLGQFL 
HGVGATPIjYTLGVTYIiDENVKSSCSPIYIAI fytaai lgpaag y 
LIGGALLNIYTEMGRRTFJLTTESPLVTVGAWWVGFIXjSGAAAFFT 

avpilgyprqlpgsqryavmraaemhqlkdssrgeasnpdfgkt 
irdlplsiwliljotptfiixcij^gateatlltgmstfspkfles 

QF5LSASEAATLFGYI*WPAGGGGTFLGGFFVNKLRLRGSAVI k 

FCLFCTWSIiIjG UjVFSLHCPS VPMAGVTAS YGGSLL PEGHLNIi 
TAPCHAACSCQPEHYSPVCGSDGUfxFSLCHAGCPAATETNVTC 

QKVYRDCSCIPQNLSSGFGHATAGKCTST


6773 


1 


630 


P WEAP KE H JCYKAEEHTVVLT VTG E P CH FP FQ Y HRQ L YH KCTHKG 
R PG PQP W CATTPNFDQDQRWGY CLEP KKVKDHCSKHS P CQKGGT 
CVNMPSGPHCLCPQHLTGNHCQKEKCFEPQLIiRFFHKNEIWYRT 
E QAAVARCQ C KG P D AH CQRLAS Q ACRTN P CLHGGR CLE VEGHRL 
CHCPVGYTGPFCDVGE* GSGASRRPAPRWDGLAR 


6774 


146 


389 


LTELSDQQYFLFFILSS/WVPTFLSMDVDGRVIKADSFSKI I SS 
GLRIGFLTGPKPLIERVILHIQVSTLHPSTFNQLMISQ 


6775 


104 


614 


TCPSQLRVLTARGGRRAPSPQLWTLVIALIEEKWRSHRILRMNS 
GRPETMENLPALYTI FQGEVAMVTDYGAFIKI PGCRKQGL.VHRT 
HMSS CRVDKPS E I VDVGDKVWVKLIGREMKNDRI KVS I>S MKWN 
QGTGKDLDPNNV\SLSKKRGGGDPSRITLGRRSPLRLS 


6776 


3 


1108 


HERHERHEGALSQDALLRIS I PLDSNMRPEKCRRFVHPQWQLLH 
LNGTFPKTSDAEMEPCVDGKVYORISFSSTIVTEWDIiVCDSQSL 

; TS VAKFVFMAGMMVGG I LGGHLS DRFGRR FVLRWC YLQVAI VGT 
CAALAP-rFLIYCSLRFLSGIAAMSLITNTIMLIAEWATHRFQAM 

t GITLGMCPSG I AFMTLAGLAFAI RDWH I LQL WS VP YFVI FLTS 
SWLLESARWLI INNKPEEGLKELRKAAHRSGMKNARDTLTLE1L 
KSTMXKELEAAQKKKPFIjGERIjHMPNICKRISIiLPFTKFANFKA 
Y FGLNLHG / L KHLGNNVFLI^TLF<3AV/ TP PGQLVLHLGHWGSG 
RVS S RGRVN CLGLFVLQ VW 


6777 


779 


63 


CFFHGPAWRBCEVRATFAKKQGQSGI ISCIAFS PAQPLYACGS Y 
GRSLGLYAWDDGSPLALLGGHOCGITHLCFHPDGNRFFSGARKD 
AELLCWDLRQSGYPLWSLGREVTTNQRI yfdldptgqflvsgst 
SGAVSVWIXPDGPGOT>GKPEPVLSFLPQKDCTNGVSLHPSLPLLG 
HCLPVS VCFLS PTESGGRRRGAG PS LGS PRRHVHLECRLQLWWC 
GGGARLQHP+ * SPRARKGR 


6778 


311 


805 


IQS ITDESRGS IRRKNPANTRIiRIiNVP\BBTAGDSE/ERSPEEB 
VQADPRIRSASPKCPTSSPFPKGRSPEGEGET\DPEKVHFHPGP 
KDKSVAEKN \KGP\S PVSSEGI KDFFSMKPEWENLNQSNTVRRMH 
T\AVRLNEVIVXKSRDAKLVLLNMPGPPRNRNGDENY 


6779 


2 


535 


RALRRQPRLLAANGIEPESMAISEPIKGSRKPCVNKEELALKKP 
MAKCAWKG PREPPQDARAE AES PGGASESDQDGGHESPPKKKAV 
AWSAKNPAPMRKKKKVSLGPVSYVLVDSEIX3RKKPVMPKKGPG 
SRREASDQKAPRGQQPAEATASTSRGPKAKPEGSPRRATNESRK 
V 


6780


3 


403 


HE VNDNKPE ININLMS PGKEEIS YI FEGDP IDTFVALVRVQDKD 
SGLNGE I VCKLHGHGHFKI^KTYFJINYLI LTNATLDREKRS E YS 
LTVIAEDRGTPSLSTVKHFTVQINDINDNPPHFQRSRYEFVISE 
K 


6781


1 


1269 


APTRPVFPTLQDLSSSKEPSNSLNLPHSNELCSSLVHFELSEVS 
SNVAPS IPPVMSRPVSSSS ISTPLPPNQITVFVTSNPITTSANT 
SAALPTHLOSALMSTVVTMPNAGSK^MVSEGQSAAQSNARPQFI 
if VriNbSS 1 1 Q VMKGSQ PSTI PAAPLi l WSGLM P PSVAWG PL 
HIPQNIKFSSAPVPPNALSSSPAPNIQTGRPLVLSSRATPVQLP 
SPPOTSSPWPSHPPV0X3VKEr^DEASPQVOTSADQNTLPSSQ 
STTMVS PLLTNS PGSSGNRRS PVS SSKGKGKVDKIGQILLTKAC 
KiCVTGSLEKGEEQYGADGETEGQGLDTTAPGLMGTEQLSTELDS 
KTPTPP APTLLKMTSS PVGPGTASAGPS LPGGALPTS VRS 1VTT 
L VP S EL I SA VPTT KSNHGG I ASESLAG 


6782


3 


1327 


RKPTVIRIPAKPGKCIiHEDPQSPPPLPAEKPIGNTFSTVSGKLS 
NVERTRNLESNHPGQTGGFVRVPPRLPPRPVNGKTI PTQQPPTK 
VPPERPPPPKLSATRRSNKKLPFNRSSSDMDLQKKQSNLATGLS 
KAKSQVFKNQDPVLPPRPKPGHPLYSKYMLSVPHGIANEDIVSQ 
NPGELS CKRGDVLVl^KQTENlTyLECQKGEXnX?RVHLSQE4KL I T 
PLDEHLRS R PNPFS PPKAPSHAQKPVDSGAPHAWLHDFPAEQV
DDLNLTSGE IVYLUBKIDTDW YRGNCRNQIG I FPANYVKVI I DI 
PEGGNGKRECVSSHCVKGSRCVARFEYIGEQKDELS FSEGEI 1 1 
LKEYVNEEWARGEVRGRTG I PPLNFVEPVEDYPTSGANVLSTKV 
P LKT KKEDSGSNSQ VNS LPAE W CEALKS FTAETS DDLS FKRGDR 
I 


6783 


3 


1750 


SYHHHHAQQSAAAS PITTAS QKTVTTTSMITTKTLPLVIjXAATA 
TMPAS WGQRPT I AMVTA INS Q KAVLS TDVQNTP VNLQTS S KVT 
GPGAEAVQIVAKNTVTLQVQATPPQPI KVPQFI P P PRLTPRPNF 
LPQVRPKPVAQNNI PXAPAPPPMIAAPQLIQRPVMLTKFTPTTL 
PTSQNS I H P VR WNGQTAT I AKTFP MAQLTS IV I AT PGTRLAG P 
QTVQLSKPSLEKQTVKSHTETDEXQTESRTITPPAAPKPKREEN 
PQKLAPMVSLGLVTHDHLEE I QS KRQERKRRTTANP VYSGAVFE 
PERKKSAVTYLNSTMHPGTRKRGRPPKYNAVLGFGALTPTS PQS 
SHPDS PENEKTETTFTFPAPVQP VSLPSPTSTDGDIHEDFCSVC 
RKSGQLLMCDTCSRVYHLDCIJ^PPIiKTIPKGMWICPRCaDQMLK 
KEEAI PWPGTLAIVHSYI AYKAAKEEEKQKLLKMSSDLKQEREQ 
LEQKVKQLSNS I S KCMEMKNT I LARQKEMHSSLE KVXQL IRLIH 
GI DLSKPVDSEATVGAI SNGPDCTP PANAATSTP APS PSSQS CT 
ANCNQGEETK


6784 


3 


1750 


SYHHHHAQQSAAAS PNLTAS Q KTVTTTS M I TTKTL P LVLKAATA 
TMPAS WGORPT IAMVTAINSOKAVLSTDVONTP VNLOTS SKVT 
GPGAEAVQIVAKNTVTLQVQATPPQPIKVPQFIPPPRLTPRPNF 
LPQVRPKPVAQNNI PIAPAPPPMLAAPQLIQRPVMIiTKFTPTTL 
PTSQNS IHPVRVVNGQTATI AKTFPMAQLTSIVIATPGTRIAGP 
QTVQLS KPSI*EKQTVKSHTETDE KQTES RTITPPAAPKP KREEN 
PQKLAFMVSLGLVTHDHT.RE I QS KRQERKRRTTANP VYSGAVFE 
PERXKSAVTYLNSTMHPGTRKRGRP P KYNAVLGFGADTPTS PQS 
SHPDS P ENEKTETTFTFPAP VQP VSLPS PTSTDGD I HEDFCSVC 
RKSGQLLMCDTCSRVYHLDCLDPPIiKT I PKGMW1 CPRCQDQMLK 
KEEAIPWPGTIAIVHSYIAYKAAKEEEKQKLLKWSSDIjKQEREQ 
LEQKVKQI^NSISKC>IEMKNTILARQKEMHSSLEKVKQLIRLIH 
GIDI^KPVDSEATVGAISNGPDCTPPANAATSTPAPSPSSQSCT 
ANCNQGEETK 


6785 


1 


528 


LGNTVLH YCSMYSKPECLKLLIiRS KPT VD I VNQAGE TAXjD I AKR 
LKATQCEDLI>SQAKSGKFNPHVHVEYEWNLRQEEIDESDDDLDD 
KPSPVKKERSPRPQSFCHSSS I S PQDKLALPGFSTPRDKQRIiS Y 
GAFTNQI FVSTSTDSPTSPTTEAPPLPPRNAGKGPTGPPITPHR 


6786 


1820 


1397 


RSPKVLVIAPTRELANHVSRDFKDI \TRXLTVARFYGGTS YQSQ 
INHIRNGIDILVGTPGRIKDHLQSGRLDl^KLPJTVAnODEVDQML 
DLGFAEQVEDI IHES YKTDSEDNPQTI*LFSATCPQWVYTVA\KK 
YMKS R YEQVDLDG KMTQKAATTVEHLA I QCHWS QRPAVI GDVLQ 
VYSGS EGRAI I FCE TKKNVTEMAMNPH 1 KQNAQC2LHGD IAQSQR 
E I TLKGFREGS FKVLVATNVAARGLDI PEVDLVIQS S P PQDVES 
YIHRSGRTGRAGRTGI CICFYQPRERGQLRYVEQKAGITFKRVG 
VPSTMDLVKS KSMDAIRSLASVSYAAVDFFRPSAQRLIEEKGAV 
DALAAALAH I SGAS S FEPRSL I TSDKGFVTMTLE SLE E IQDVS C 
AWKEI^RKLSSNAVSQITRMCLLKGNMGVCFDVPTTESERLQAE 
WHDSDWILSVPAKLPEIEEYYDGNTSSNSRQRSGWSSGRSGRSG 
RSGGRSGGRSGRQSRQGSRSGSRQDGRRRSGNRNRSRSGGHKRS 
FD * VP YHIiVDFLSDFLVDSVYIjTGRQ I DHLTGLTGL I DHLTSHS 
SVWN 


6787 


2646 


2270 


PS S FPKNVPLEELE E PP K* KRSGLGS LTPKSQIQNG P * PQTFF F 
FELGS PSGVI S AHCNLRLLGS S DS P APASRVAG I IGTCHHAWLI 
L VFXi VBMG FHHVGQ AGLKLLTL \VIHPPWPP KVLGLQT 


6788 


16 


936 


GGTVDLR \DMLAVS VLAAVRGGR/ ATVRRVRESNVLHEKS KGKT 
REGAEDKMTSGDVLSNRKMFYLLKTAFPSVQINTEEKTO\ELDQ j
E VI LVJG S * DS * G YPKGK* LI>P KEVPS R / RVLLSGLTPLDATQ;E\ 
FTEDLS K\ YVTTMVCVAVNG KPMLGVI HKP FSBYTAWAMVDGGS 
NVKARSSYNE KTPRI WSRSHSGMVKQVALQTFGNQTTI I PAGG 
AG YTCVliALLD VPDKSQEKADLYIHVT YT KKWD I CAGNAI LKALG 
GHMTTLSGEBISYTGSDGIEGGIXASIRMNHQAIjVRKIxPDIiEKT 
GHK 


6789 


2 


678 


GNG INVLKI APESAI KFMA YEQ I KRLV W * * PGDS * G F/ YERLVA 
GSLAGAI AQSS I Y PMEVLKTRMALRKTGQ YSGMLDCARR ILARE 
GVAAFYKG YVPNMLG 1 1 P YAG IDLAVYETLKNAWLQHYAVNSAD 
PGVFVLLACGT^STCGQIJ^YPLALVRTRMQAQAS IEGAPEVT 
MSSLFKHII^TEGAFGLYRGIAPNFMKVIPAVSISYVVYENLKI 
TLGVQSR 


6790 



2 


4068 


APPAGRRRMQAAPRAGCGAALliWIVSSCLCRAWTAPSTSQKCD 
EPLVSGLPHVAFSSSSS ISGS YSPGYAKINKRGGAGGWS psdsd 
HYQWLQVDFGNRKQ I SAIATQGR YSS SDWVTQYRMLYSDTGRNW 
KPYHQDGNI WAFPGNINSDGWRHELQHP I IARYVRIVPLDWNG 
EGRIGLRIEVYGCSYWADVINFTCHVVIjPYRFRNKKMKTLKDVI 
ALNFKTSESEGVIIJIGEGG^^YITLBI*KKAKLVI^LNLGSNQL 
G P I YGHTS VMTG S LLDDHHWHS WI E RQGRS I NLT LDRSMQH FR 
TNGEFDYLDLDYE I TFGG I P FSGKPSSS S RKNFKGCMES INYNG 
WITDLARRKKLEPSNVGNLSFSCVEPYTVPVFFNATSYLEVPG 
RLNQDLFSVSFQFRTWNPNGLLVFSHFADNLGNVEIDLTESKVG 
VHINlTQTKMSQIDISSGSGLNDGQWHEVRFliAKENFAILTIDG 
DEASAVRTNS PLQVJCTGEKYF FGG FLNQMNNS SHS VLQPS FQGC 
MQLIQVDDQLVNLYEVAQRKPGSFANVS IDMCAI IDRCVPNHCE 
HGGKCSQTWDS FKCTCDETG YSGATCHNS I YEPSCEAYKHLGQT 
SNYYWIDPDGSGPI^PLKVYCNMTEDKVWTIVSHDLQMQTPWG 
YNPEK^SVTQLVYSASMDQISAITDSAEYCEQYVSYFCKMSRLL 
NTPIX3SPYTWWVGKANEKHYYWGGSGPGIQKCACGIERNCTDPK 
YYCNCDAD Y KQWRXDAG FLS YKDHLPVS Q WVGDTDRQG S EAKL 
SVG PLRCOGDRNYWNAASFPNPS SYIjHFST FOGETSADI SFYFK 
TLTPWGVFLENMGKEBFIKLELKSATEVSFS FDVGNGPVEIWR 
S PTPLNDDQWHRVTAERNVKQAS LQVDRLPQQ I RKAPTEGHTRL 
E L Y SQL FVG G AGGQQGFLGCIRS LRMNGVTLDLEERAKVTSG F I 
SGCSGHCTSYGTNCENGGKCLERYHGYSCDCSNTAYDGTFCNKD 
VGAFFEEGMWLRYNFQAPATNARDSSSRVDNAPDQQNSHPDLAQ 
EEIRFSFSTTKAPCILLYISSFTTDFXiAVLVXPTGSLQIRYNLG 
GTREPYNIDVDHRNMANGQPHSVNITRHEKTrFLKLDHYPSVSY 
HLPSSSDTLFTSISPKSLFLGKVIETGKXDQEIHKYNTPGFTGCLS 
RVQFNQIAPLKAALRQTNASAHVHIQGELVESNCX1ASPLTLSPM 
SSATDPWHLDHIJ5SASADFPYNPGQGQAIRNGVNRNSAI IGGVI 
A\W1 FTPSLCTP\VLP * SR *HVS PHKGTLP I PNEAKGAGSRQK 
KPGRRPSMNNDPPTSQRP IDESKKEWPIDkRGG YLAMG 


6791 


1601 


1193 


TGHEGAKGEKGDKGDLGPRGERGQHG PKGEKG YPG IP PEL/ PG W 
SAW* SWLTAASTKVQAILLPQPLE * LGLQIAFMAS LATHFSNQ 
NSGI I FSSVETNIGNFFDVMTGRFGAPVSGVYFF^FSMMKHEDV 
EEVYVYIJ4HNGITIVFSMYSYEMXGKSOT^ 
LRMGNGALHGDHQRFSTFAGFLLFETK 


6792 


33 


1073 


VRHTNWGVDMYLFSLGSESPKGAI GHI VSTEKT I LAVERKKVLL 
PPLWNRTFS WGFDDFSCC1CS YGSDKVLJ^FF^IuAA WGR CLCAV 
CPS PTTIVTSGTSTWCVWFJ^MTKGRPRGIiRLRC^VLYGHTQAV 
TCLAASVTFSUjVSGSQDCTCILWDLDHLTHVTRLPAHREGISA 

iti sdvsgt ivscagahlslwnvngqplasittawgpegaitcc 
clmegpawdtsqi iitgsqdgmvrvwkt/vgced vcswtasrrg 
apgsaskpkrpqvgefjk3lesragr*hcfdreacx2nqp\pvtal 
avsrnhtkllvgdergri fcwsadg* eergsrgsgttvpg 


6793 


2340 


805 


GRKEANY\YGSIjTQAGTVSLGIiDA^ 

LVFIXSWDISSLm^AEAMRRAKVI^WGIiQEQLWPHMEAIiRPRPSV 
Y I PEFIAANOSARADNIiI PGSRAQQLEQIRRD IRDFRSS AGLDK 
VIVLWTANTERFCEVTPGLNDTAENLLRT IBLGLEVS PSTLFAV 
AS I LEGCAFLNG S PQNTLVPGALELAWQHR VFVGGD D F KS G QTK 
VKSVLVDFL IGSGLKTMS I VSYNHLGNNDGBNLSAPLQFRS KEV 
SKSNWDDMVQSNPVLYTPGEBPDHCWI KYVP YVGDS KRALDE 
YTSELMU3GTNTX»VIiHNTCEDSLIiAAP IMLDLALLTELCQRVS F 
CTDMDPEPQTFHPVLSLLSFIjFKAPIjVPPGSPVVNAIiFRQRSCI 
EN I LRACVGLP PQNH^^LEHKMERPGPSI»KRVGP VAATY PMLNK 
KGPVPAATNGCTGDANGHT iQBBPPM PTT * GPGHTVSRLFLP AAP 
HDPTLKAPTNKGRCHFSPPSTWGSWGL 


6794 


169 


1349 


DDVKRKPEASAH * EKPGP PSRPGVRGGRERAGGRGSHGARS CR \ 
EPAPPAPAPPEDHPDEEMGFTIDI KS FLKPGEKTYTQRCRLFVG 
NLPTDITEEDFKRLFEKYGEPSEVFINRDRGFGFIRLESRTLiAE 
IAKAELDGTI LKSRPLRI RFATHGAALTVKNLS P WSNELLEQA 
FSQFGP VEKAVWVDDRGRATGKGFVEFAAKP PARKALERCGDG 
AFIiLTTTPRPVIVEPMEQFDDEDGLPEKLMQKTQQYHKEREQPP 
RFAQPGTFE FE YASRWKAIJ5EMEKQQREQVDRNIREAKEKLE7AE 
MEAARHEHQLMLMRQDlJtfRR 

HEEEHRRREJ^MIRHRE<JEELRRQQEGFXr>NYMENYVCHFLR 


6795 


1740 


1010 


G PRRQTQ VRD1 I ELDS F*DWAAQETDCAQNSGERL* KGV/LENFS 
TMS KSAVKISLDLLSNPLCE QDQDIiIjNMVIT^LDTAMKRMDAFNG 
EKVNQI QKTVT E PLKKFGS VFPS LNMAVKRREQALQD YRRLQAK 
VEKYEEKEKTGPVLAKLHQAREELRPVREDFEAraiRQIJjEEMPR 
FYGSRLDYFQPSFESIjIRAQWYYSEMHKIFGDLSHQLDQPGHS 
DEQRERENEAKLS ELRALS IVADD 


6796 


48 


683 


GKE I QI PTI KLAWIiI*FG1»E * PVGALG KG WS P* * SHVALGQ LGW 
DTRAVRSS WRWELCVS AQEWSQRS A* S S PSP VGACPSLNPPET 
SVQEGRDCWQR* LPRLFSALVGQPGCWPQGAPPERCV* PGRCKW 
HLQSQVLR* ERKRCCRCLPRFA* GWRRRHQRLGLG IHPAPLGST 
SPPHPEGNSQQCRR*GWAAELRLPSSWL*GKLGC* 


6797 


1620 


211 


TERMTPSQPTRGSSCTRFSSMIiWTSTVIRCLTCHWAGKRMSWGV 
TLGPMAO^IiSASGTTTEATWTRPTTHLTLIRWWLIiTASRVDPP 
ERPPPPPSDDLTIAESSSSYKNL/DAQIPQ/DWSMSPSTSG*RP 
LTSRASS IMRSRTAI PSAS *SRLTTKHTVGGSPSAWRPRPTSRS 
VSTPVSSSTE^TTASGSCIiTWWSSSPAPCPSSSAPAHSFEASCCK 
TSIjWGSCX3GSGDGSSAOGSGWNI»SMAGTS CSSPAMCSPSRAPS * 
RS ASRPRTWRATTS AASS WAPRRCW CGWA* SAT* PSSTTTI S SS 
PHCGWPCPASCASAAAWLSSTWATAS VAGSCWGP IM ♦ SSAHSPW 
CLSACSRSSMGTTCI**RSPP\SGASRAAAAWCGSSPSSTFTPSS 
ASSSTWCS ASSSRSSPAPTTPSS I PAAQAQRRASCRPTSHSART 
APPPAS SAAGAARPAAFSAAAEGTPRRS I RCW 


6798 


3894 


1696 


STISWESLESWlJ«KATNPSNRQEDWEYIIGFCDQINKEIiEG*VS 
AltWGQLRGSGLGRGTTMAKEGQPGS PRLSALECVLLVPQ\ PQ I A 
VRLLAHKI QSPQEWEAI^ALTYLGDRVS EKVKTKV I ELLYSWTM 
AI>PEEAKIKDAYHMIiKRQGIVQSDPPIPVDRTLIPSPPPRPKNP 
VF13DEEKSKLLAXLLKSKNPDDLQEANKLIKSMVREDEARIQKV 

TKRLHTLEEVN2*NVRIJ>SEMTJ>HYSQEI>SSTC^ 

ENKRRTLFKLASF/TEDNDNSIX3DIIjG^SDNI»SRVINSYKTIIEG 

QVINGEVATLTLPDSEGNSQCSNTOlXIPLAEI^TNSIiSSVIA 

PAPTPPSSGIPlLPPPPQASGPPRSRSSSQAEATTiGPSSTSNAL 

S VniiDEEIJjCLGIiADPAPNVP PKES AG^SQWHIJjQR^QSDLDF FS 

PRPGTAACGASDAPLLQPSAPSSSSSQAPLPPPFPAPWPASVP 

APSAGSSLFSTGVAPAI^KVEPAVPGHHGLJU/5NSAI^HIX)A1> 

DQLLEEAKVTSGLVKPTTSPLIPTTTPARPIJjPFSTGPGSPLFQ 








PLS FQSQGS PPKGPELSLAS IHVPLES I KPSSALPVTAYDKNGF 
RILFHFAKECPPGRPDVLWWSMUJTAPLPVKSIVLQAAVPKS 
MKVKI^PPSGTKT.SPFSPIQPPAAITQVMIJuANPI,KEKVRX,RY 
LTFALGEQLSTHVGEVDQFPPVEQWGNli 


6799 


3894 


1696 


STI SWESLESWLNKATNPSNRQEDWE YI IGFCDQINKELEG * VS 
ALWGQLRGSGL^RGTTMAKEGQPGSPRLhSALECVLLVPQXPQIA 

VRLXAHKI QS PQEWEALQALTYXiGDRVS E KVKTKV I EUJYS WTTl 

ALPESAKIKDAYHMLKRQGIVQSDPPIPVDRTL»1PSPPPRPKNP 

VTODEEKSICLLAiCLLKSKNPDDLOEANKLIKSMVREDEARIQKV 

TKRIiHTLEEVNNNVRlJjSENttJjHYSQEDSSDGDI^I^K^ 

ENKRRTLFKXASETEDNDNSLGDILQASDNLSRVINSYKTI I EG 

QVINGEVATLTliPDSEG^S(X^SNQG^LIDl^LDTTNSI/SSV]UA 

PAPTPPSSGIPILPPPPQASGPPRSRSSSQAEATIjGPSSTSKAL 

SWLDEEI,LCLGliADPAPNVPPKESAGNSQWHLIjQREQSDLDFFS 

PRPGTAACGASnAPLLQPSAPSSSSSQAPLPPPFPAPWPASVP 

APSAGSSLFSTGVAPALAPKVEPAVPGHHGLALGNSAl>HHIiDAI, 

DQLLEEAKVTSGLVKPTTSPLI PTTTPARPLLPFSTGPGS PLFQ 

PLSFQSQGSPPKGPELSLASIHVPL.ESIKPSSALPVTAYDKNGF 

B T T,FHP AlCFCPPGRPDVliVVVVSMLiNTAPLPVKS I VLQAAVPKS 
MKVKLQPPSGTELSPFSPIQPPAAITQVMLLlANPLKEKVRLRYK 

1»TFALGEQI*STEVGEVDQFPPVEQWGN1i 


6800 


404 


1646 


RRSPSTGLS PVPQPSS PSLSDYS I PWSIiLLSGTIAWATPGK* AG 

* PQAW* LGIiAPAlAFI /GLTRGRKQN KE KMAEGGSGDVDDAGDC 
SGARYNDWSDDDDDSNESKSIVWPPWARIGTEAGTRARARARA 
RATRARRAVQKRASPNSDDTVLSPQEI^K^CLVEMSEKPYILE 
AALIALGNNAAYAFNRDI IRDLGGLPIVAKI LNTRDP I VKEKAL 
I VLNNI^ VNAENORRL KVYMN QVCDDT I T S R LNS S VQ LAGLRIjLi 
TNMTVTN E YQHMIiAN S I SDFFRIiFSAGNEETKLQVLKLIiIjNIAE 
NPAMTRELLRAQVPSSLG\SLFNKKENKEVILKLLVI FENINDN 
FKWEENEPTQNOFGEGSLFFFLKEFQVCADJCVLGIESHHDFLVK 
VKVGKFMAKLAEHMFPKSQE 


6801 


2 


1755 


SAEEFESQQASVTMH0VDAESFEVLVDYCYTGRVSLSEANVERL 
YAAS DMLQLE YVREACAS FLARRLDLTNCTAI LKFADAFGHRKL 
RSQAQSY I AQNFKQL SHMGS I REETIiADLTltAQLLAVLRLDSLD 
VESEQTVClIVAVQWIiEAAPKERGPSAAEWKC^WMHFTBEDQD i 
YLEGIiLTKPIVKKYCLDVIEGAIiQMRYGDLIiYKSLVPVPNSSSS 
/R* QQQLSCI CSRKSTPETGYVCQGDGDLLWTPQRSLS \RYDPY 
SGDI YTMPS PLTSFAHTKTVTSSAVCVS PDHDI YLAAQPRKDLW 
VYKPAQNS WQQI*ADRLLCREGMDVAYI>NGYI YILGGRX>PI TGVK 
LKEVEC YS VQRNQ WALVAP VPHS FYS FEL I WQNYIiYAVNS KRM 
LCYDPSHNMWLNCASLKRSDFQEACVFNDEIYCICDIPVMKVYN 
PARGEWRRISNIPLDSETHNYQIVNHDQKLI*I*ITSTTPQWKKNR 
VTVYEYDTREDQW INIGTMLGLLQFDSGFICLCARVYPS CLEPG 
QSFITEEDDARSESSTEWDLDGFSELDSESGSSSSFSDDEVWVQ 

VAPQRNAQDQQGSL 


6802 


157 


1341 


ETFPLFFFLLSKTPGKTASMAHFVQGTSRMIAAESSTEHKECAE 

PSTRKNI^SLEQK^RCI^KQRKELLEVNQQWDQQFRSMKELYE 

RKVAELKTKLDAAERFLSTREKDPHQRQRKDDRQREDDRQRDLT 

RDRIX^REEKEKSRIiNEELI^LKEEinCLLKGKNTLANK^ 

EI KRLNKALQDALNI KCS FSEDC1.RKSRVEFCHEEMRTEMEVLK 

QQVQ I YEED FKKERSDRERLNQEKEEIiQQI NETSQSQI*NRLNSQ 

IKACC^JEKEKLEKQLKQMYCPPCNCGIjVFHIiQDPWVPTGPGAVQ 

KQREHPPD YQW YALDQLP PDVQHKAN /DWCIAPPPVCCQAG / PR 

TPGLK+ S S CLWLPKC * NFRFILS KES PS VEVHTNRERQQATRER 

G 


6803 


1 


2203 


K-LSGR P YRHMG VLGTS KL YD I RKT I FTFTPQF IDQQQFYLALDN 






KMIV^MLilTDbSYbCSRWRMTGQPTITFPISHSMIiDEDGTSLNS 
SI LAALRKMQIXSYFGGARVQTGKIjSEFLTTSCCTHLS fmdpgpe 
GKLYSEDYDDNYDYI^SGNWMNPYDSTSHAROn3EVARYIJ3HI^ 
AHTAPHPKIAPTSQKGGLDR^QAAVQTTCn3LMSI»WKAKEI^ 
NVHMYLPTKLFQASR PS FNLLDS PHPRQENQ VPSVR VE IHLPRD 
QSGEVDFKAIjVIjQLKETSSIjQEQADILYMLYTMKGPDWNTSX.YN 

ERS ATVREIiLTEL YGKVGE IRHWGL IRYISGI LRKKVEALDEAC 
TDLOjSHQKHIiTVGLPPEPREKTISAPI/PYEAIiTQLIDBASEGDM 
SIS II/FQE I MVYI*AM YMRTQPGIiFAEMFRLRIGLI IQVMATELA 
HSIJvCSAEEATEGLMNI^PSAMXl^LHH^ 

PTDSNVS PAI S IHE I GAVGATKTERTG I MQLKSEI KQVE FRRLiS 

ALNRVPVG F^QKWKVLQKCHG LS VEGFVLP SSTTREMT PGE I K 
FS VHVES \ VliNVLIiRP EYRQLL VEAI LVLTMLAD I E IHS IGS 1 1 
AVEKTVHIANDLFLQEQKTLGP \DDTMI*AKDPASG\ I CTLR\ YD 
SAPSGRFGTMTYLS \RAA\ATYVQEFLP\HS I CAMQ 


6804 


1 


951 


GSPGKKEEKAXNKESLCMENSSNSSSDEDEEETKAKMTPTKKYN 
GLEEKRKSLRTTGFYSGFSEVAEKRIKLI^SDEm^NSRAKDR 

KDVWSS IQGQWPKKTLKELFSDSDTEAAAS P PHPAPEEGVAEES 
LQTVAEBESCSPSVEIiEKPPPVNVDSKPIEEKTVEVNDRKAEFP 
SSGSNFSA* IPLPYIiHI*NRI 1 HQSL*QKGSRQQSSVTVSEPI i APN 
QEEVRSIKSETDSTIEVDSVAGEIiQDI J QSERE*L»ASRF*CQCKL. 

QQKEGKRHK 


6805 


1539 


206 


RQPDLKYFGKSFDVSVSESSSl^NDLPKFADGIKARNRNQNYL 

VPS PVLRI LDHTAFSTEKSADIVI CDEECDSPES VNQQTQEESP 
IEVHTAEDVPIAVEVHAISEDYDIETE1WSSESLQDQTDEEPPA 

KLCKI LDKSQAI*NVTAC^KWPI*LRANSSGLY KCELCEFNSKYFS 
DLKQK^JILKHKRTDSNVCRVCKESFSTNMLLI EHAKLHEEDPYI 
CKYCDYKTVI FENLSQH I ADTHFSDHIjYWCEQCDVQFS S SSEIiY 
LHFQEHSCDEQYLCQFCEHETTTDPEDIjHSHVVNEHACKIjIE^ 

kynngehgqysllskitfdkcknffvcqvcgfrsrlhtnvnrhv 
aiehtkifphvcddcgkgfssmleXlakhi^shi.segiylcqyw 
eystgqiedlkihij^fkhsadlphkcsda^nvfgnerelishijp 

VHETT 


6806 


272 


3794


VAI*CFPNSDPVMF^IDAFYGCIJLAE1^PVPIEVP1 J TRKDAGSQQV 
GFIxlX5SCGVFliALTTDACQKGLPKAQTGEV^ pls wlvi 
DGKHLAKPPKDWHPLAQDTGTGTAY I EYKTS KEGSTVGVTVSHA 

S LiIaAQCRALTQACGYS eaetltnvldf krdag lwhg vlt s vmnr 
mhvvsv^yal^kanplswiqkvcfykaraalvksrdmhws 
rgqrdvsmsi*rmlrvadganpwsi sscbaflnvfqsrgiirpev 
icpcasspealtvairrppdlggppprkavlsmnglsygvirvd 
teeklsvltvqdvgqvmpganvcwklegtpylcktdevgeicv 
s s s atgta yygllg i t knvfeavpvttggap i fdrpftrtgllg 
f igpdhlvf i vg kxjx5lm vtg vrrhnad dwatalave pmkfvy 
rgr iavfsvtvlihddr ivlvaeqrp das beds fqwmsrvlqa i d 
s ihqvgtvycl»a1»vpantlp kaplgg i h i setkqr flegtlhpcn 
vlm cphtcvtni»pkprqkqpevgpasmi vgniivagkr iaqasgr 

PT-RWT.renfi-nQAB yFT 'PTiADVI^WRAHTTPDHPLFIjIiLNAKGTVT 
STATCVQl^KRAERVAAAl^KGRLSVGDHVALVYPPGVDLl^^ 
FYGCL YCGCVPVTVRP PHPQNI^TTIjPTVKMI VETVSKSACVLTT 
QAVTRLLRSKEAAAAVDIRTWPTILI^ 

DVLAYLDFSVSTTGI LAGVKMSHAATSA1XRS IKLQCELYPSRQ 
IAICIJ3PYCGLGFALWCLCSVYSGHQSVI.VPPLELESNVSLWLiS 

AVSQ YKARVTFCCYSVMEM CTKGIiGAQTGVI*RMKG VNI*S CVRTC 
MWAE ERP \ R IALTQS FS KL» FKDLGL P ARAVS TT FGCRVNVAI C 








LO^TAGPDPTTVYVDMRALRHDRVRIjVEKGSPHSLPLMESGKIL 
PGVKVI XAHTETKGPLGDSHLGE I WVSS PHNATGYYTVYGEEAL 
HADH FSARLS FGDTQT I WARTG YLG FLRRTELTDASGGRHDAL Y 
WGSLDETLELRGMRYHPIDIETSVTRAHRS IAECAVFTWTNLL 
VVWEXDG L EQD ALDL VAL VTNVVLEEIfYLVVG VVV I V D P G V I P 
INSRGEKQRMHLRDGFLADQLDP I YVAYNM 


6807 


1444 


606 


VGHDTVHAMFTCFPKCLGFSP PVNVTVSPRSEESHTTTVSGGNG 
. SVFC^GPQLO^OiANIiEARRGSIGAALSSRDVSGLPVYAQSGEPR 
RLTQAQVAAFPGENALEHSSDQDTWDSLRS PGFCSPLS SGGGAE 
SIjPPGGPGHAEAGHI/5KVCDFHLNHQQPSPTSVIjPTEVAAPPL»E 

kilsvdsvavdcayrtwkpgpqpgphgsixtegclrslsgdln 
flfrlpmglscplqvq 


6808


2063 


737 


GVGSGAASALARSRPLASRLSSFJU*TRAPRSGAMQRLAMDLRML 

SRELSLYLEHQVRVGFFGSGVGLSL ilgfsvayafyylss iakk 

PQLVTGGES FSRFXQDHCPVVTETYYPTVWCWEGRGQTLLRPF\ 

ITSKPPVQYRNELIKTADGGQISIiDVJFDNBNSTCYMDASTRPTI 

LI^PGLTGTSK^SYILHMIHI^EEIX3YRCWFNNRGVAGEN 

nn •PVY'V** AMTtyrtT PTV t TTTTVTT-IQT ,vpqq tjttt . » a V^MfJfSMT .T.T ,myt. 
.fit X XL i >\yv .1 p> i j | »r». i vi IXti V noij 1 tr S>*\tr JC Junnu voi'iuui JUiJJJi' * 

GKIGSKTPLMAAATFSVGWNTFACSESLEKPIiNWI^FNYYIjTTC 
LQS S VNKHRHMFVKQVDMDHVMKAKS I REFDKRFTS VM FG YQT I i 
DDYYTDAS PS PRLKSVG I PVLCLNSVDDVFS PSHAI P I ETAKQN 
PNVALVLTS YGGHIGFLEG I WPRQS TYMDRVFKQ FVQAM VEHGH 
ELS 


6809 


939 


65 


DYSGQTPVPTEHGMTI.YTPAQTHPEQPGSEASTQPIAGTQTVPQ 

GQFGKI LDVEI I FNERGSKGFGFVTFETSSDADRAREKLNGTIV 
EGRKIEVNNATARVMTNKKTGNPYTNGWKLNPVVGAVYG PEFYA 
VTGFPYPTTGTAVAYRGAHliRGRGRAVYNTFRAAPPPPPI PTYG 
AWYQDGFYGAE I \ LEAT Q PTDTLS PLQRRQ PTATVTAE S TQLP 
TRT 1 TPSG PRRP TALE P CETFHRFLLGP 


6810 


939 


65 


DYSGQTPVPTEHGOTLYTPAQTHPEQPGSEASTQPIAGTQTVPQ 
TDRAAOTDS OPLHPSDPTE KOOP KRLHVSNI P FRFRDPDLROMF 
GQFGKI LDVEI I FNERGS KG FGFVTFETSSDADRAREKIiNGTIV 
EGRKI E VNNATARVMTNKKTGNP YTNG WKLNP VVGAVYG PEFYA 
VTGFPYPTTGTAVAYRGAHI^GRGRAVYNTFRAAPPPPPIPTYG 
AWYQDGFYGAE I \ LEATQ PTDTLS P LQRRQ PTATVTAE S TQLP 
TRTITPSGPRRPTALEPCETFHRFLLGP 


6811 


1522 


658 


DLVTVWSFVDCRVIASTHGH\KSWVSVVAFDPYTTSVEEGDPME 
FSGSDEDFQDLLHFGRDRADSTQCRLSRRNSTDSRPVSVTYRFG 
SVCK)DTQLCLWDLTEDILFPHQPLSRARTHTNVMNATSPPAGSN 
GNSVTTPGNSVPPPLPRSNSLPHSAVSNAGSKSSVMDGAIASGV 
S KFATLS LHDRKERHHEKDHKRNHS MGHI SSKSS DKLNLVTKT K 
TDPAKTLGTPLCPRMED V PLLEPLI C KKI AHERLTVLI FLEDCI 
VTACQEG FICTWGRPGKWSFNP 


6812 


4001 


1682 



EDA VFSLDLSTI I QG TWFLNGEELKSNEPEGQVE PGALRYR IEQ 

KGLQHRL I LHAVKHQDSGALVG FS CPGVQDS AALT I Q ES PVH I L 

SPQDKVSLTFTTSERVVLTCELSRVDFPATWYKDGQKVEESELL 

WKMDGRKHRL1 LPEAKVQDSGE FE CRTEGVSAFFGVTVQD P P V 

HI VDPREHVFVHAITSECVMLACEV \DR\EDAP VRWYKDGQBVE 

ESDFVVLE^JEGPHRRLVLPATQPSDGGEFQ^VAGDE^YFTVTI 

TDVSSWIVYPSGrCVYVAAVRLF4*WLTCELCR^ 

E VVES PALLLQKEDTVRRLVLPAVQLEDSGEYLCE I DDESAS FT 

VTVTEPPVRI I YPRDEVTL IAVTLE CWLMCELSREDAPVRWYK 

DGLEVEESEALVLERDGPRCRLVLPAAQPEDGGE FVCDAGDDSA 

FFTVTVTEPPVQFLAI^TPSPIX^APGEPVVLSCEIoSRAGAPV 








VWSHNGRPVQEGEGLBLHAKGPRRVLCIQAAGPAHAGLYTCQSG 
AAPGAPSLS FTVQVAEP PVRVVAP EAAQTRVRS TPGGDiiELWH 
LSG PGGP VR WYKDGERLAS QGR VQLEQAGAR Q VLR VQGARS GDA 
GEYLCDAPQDSR I F1»VSVEEPLI»VKLVSDLTPLTVHEGDDATFR 
CEVSPPDADVTWU^GAVVTPGPQRQSCC^ 

CVSKWRQAEWVQRGPCAGCEVGSPCPTTLACPNPRMGTSTASSS 
MVSYWPTRAPTAARATTIAPWPGSA 


6813


9 


836 



SSTQQRPGVPAGPRPLDGYLGVADHKPLKKHCRDCAL.VTSSGHL 
LHSRQGSQI DQTECVI RMNDAPTRGYGRDVGNRTSLRVIAHSSI 
QR I LRNRHDUjNVS QGTV F I FWG PS S YMR RDG KGQ VYNNLHLI*S 
QVLPRLKAFWITRHKMLQFDEL.FKQETGQ\NRKISNTWLSTGWF 
TMTIALEIiCDR I NVYGMG P PDFCRD PNHPS VP YH Y YE P FG PDEC 
TMYLSHERGRKGSHHRFITEKRVFKNWARTFNIHFFQPDWKPES 
LAINHPENKPVF 


6814 


3 


737 


KFRRQEAN/ARERNRMHGIiNDALDNliRKW PCYS KTQKL.S K I ET 
LRLAKNY I WALS EI LRIGKRPDLLT FVQNLCKGLSQPTTNL.VAG 
CLQLNARS FLMGQGGEAAHHTRSP YSTFYP PYHS PEI/TTPPGHG 
Tr^DNSKSMKPYNYCSAYES FYESTSPECAS POFEGPLSPPPINY 
NGIFS UCQEETIJJYGKNYNYGMHYCAVPPRGPLGQGAMFRLPTD 
SHPPYDLHLRSQSI>TMQDEI*NAVFHN 


6815 


906 


553 


G<3IjDPASQTKVVEIiIjKDGSGRRGDRRSSRDMAGGAGPRSESDIjE 
DVGPTAEWNGDGSGSLRRSGSFGKLRDALRRSSEMLVKKLQGGT 
PQEPPNPRMKRASSLNFLNKSVEEPTQPGG 


6816


1 


803 


NLLKTHKF\LIX?QDEDSIJISVPVAQMGNYQEYI>KTLASPLREI D 
PDQPKRLHTFGN^FKQDKKGMMIDEADEFVAGPQNKVKRPGEPN 
SPMSS KRRRSMS IjLLRKPQTPPTVTNHVGGKGPPSAS WFPS YPN 
IjIKPTLVHTDATIIHDGHEEKMENGQITPDGFLSKSAPSEIjINM 
TGDIiM P PNQ VDS 1»SDD FTS LS KDGL I QKPGSNAFVGGAKNCS LS 
VDDQKDPVASTIX3AMPNTLQITPAMAQGINADI KHQLMKEVRKF 
GRSK 


6817 


172 


3457 


LGMMDS PKIGNGLPVIGPGTDIGI SSIaHMVGYLGKNFDSAKVPS 
DEYCPACKEKGKLKALKTYRISFQESIFLCEDLQCIYPLGSKSL 
mLISPDLEEC^TPHKPQKRKSLESSYKDSLI^J^SKKTRNYIA 
IDGGKVIWSKHNGEVYDETSSNIiPDSSGQQNPIRTADSLERNEI 
LEADTVDMATTKDPATVDVS GTGRPS PQNEGCTS KLEMPIiESKC 
TS FPQALCVQWKNAYAIjCWIjDCI LSAIiVHSEELKNTVTGLCS KE 
ESI FWRKLTKYNQANTLLYTS QLS GVKDGD CKKLTS E I F AE I ET 
CLNEVRDE I FISLQPQLRCTLGDMES P VFAFPLLLKLETH I EKL 
FI»YSFS WDFECSQCGHQYQNRHMKSLVTFTNVI PEWHPLNAAHF 
GPCNNCNSKSQIRKMVLEKVS P I FMLHFVEGLPQNDLQHYAFHF 
EGCLYQITSVIQYRANNHFITWILDADGSWLECDDIiKGPCSERH 
KKFEVPAS E IH I VI WERK I SQVTDKEAACLP I*KKTNDQHAI*SNE 
KPVSIjTS CS VGDAAS AETAS VTHPKD I SVAPRTLSQDTAVTHGD 
HLLSG PKGLVDN I LPLTLEETIQKTAS VSQLNSEAFL\liENKPV 
AENTGILKTOTl^SQESLMASSVSAPCNEKLIQDQFVDISFPSQ 
WNTNMQS VQLNTEDTVNTKSVNNTDATGLIQGVKSVEI EKDAQ 
LKQFI/TP KTEQLKPERVTSQVSNLKKKETI7U>SQTTTS KSLQNQ 
SLKENQKKPFVGSWVKGLISRGASFWPLCVSAHNRNTITDIjQPS 
VKGVNNFGGFKTKGINQKASHVSKKARKSASKPPPIS KP PAGPP 
SSKGTAAHPHAHAASEVLEKSGSTSCGAQLNHSSYGNGI SSANH 
EDLVEGQIHKLRLKLRJCKIiKAEKKKIjAAI^lSSPQSRTW 
QVPQDGS PNDCES IEDLLNEI.P YP IDIANES ACJLT VPGVSLYSS 
OTHEEII*AELLS PTPVSTELS ENGEGDFRYLGMGDSHIPPPVPS 
EFNDVSQNTHLRQDHNYCS PTKKNP CEVQPDSLTNNACVRTLNL 
ESPMKTDI FDEFFS SS ALNALANDTLDLPHFDEYLFENY 


6818 


2 


240


RGFDKVI»WT/LSGAVK\CVQFSRISPIX3EEGYPGEtjICVVA^TYTLt 








DGGE/ LHS /ATTEHKP/ VQATPVNIjT\TILTSTWQARLPQI 


6819 


1 


961 


GIPCTEMGNFT>NANVTGEIEFAIHYCFKTHSLiEICIKACKNI*AY 
GEEKKKKCNPYVKTYl^PDRSSQGKRKTGVQRNTVDPTFQETLK 
YQVAPAQLVTRQI^IVS VWHLGTLARRVFLGEVI IPLATWDFEDS 
TTQSFRWHPLRAKADKYBDSVPQSNGELTVRAKLVLPSRPRKLQ 
EAQEGTDQPSI^GQLC^VVLGAKNLPVRPIXSTI^SFVKGCLTLP 
DQQ KLiRXiKS P VL»RKQ AC P QW KHS FVFSGVT PAQLRQS SL.ELTVW 
DQALFGMNDRLLGGT \ RLGS KG DTAVGGDACS QSKLQ WQKVI>S S 
PNLWTDMTLVLH 


6820 


1014 


340 


GDM V Y IVGHVP PG FFE KTQN KAW FREG FNE K YL KWRKHHRVI A 
GQ F FGHHHTDS FRMLYDDAG VP I S AMF I TPG VTP WKTTLPGWN 
GANNPAIRVFEYDRATLSIJCDMVTYFttlNI^ 
QLTEAYGVPDASAHSMHTVLDRIAGDQSTLQRYYVYNSVSYSAG 
VCDEACSMQHVCAMRQVDIDAYTTCDYASGTTPVPQLPLLLMAL 

LGLCT 


6821 


1088 


518 


EFDI Y R / BVGGEFVP VTRDDS SNG F PRTQHG PS PTVH P I QS PQN 
RFCVLTLDPETLPAI ATTLI DVLF YSHSTPKEAASSS PE PSS IT 
FFAFSL»I EG YI \SIVMDAETQKKFPSDLLLTSSSGELWRMVRIG 
GQPU3FDECGIVAQIAGPLAAADISAYYISTFNFDHALVPEDGI 

GSVI EVLQRRQEGLAS 


6822 


1088 


518 


EFDI YR /EVGGEFVPVTRDDSSNGFPRTQHGPS PTVHPIQSPQN 
RFCVLTLDPETLPAIATTLIDVLFYSHSTPKEAASSSPEPSSIT 
FFAFSI*IEGYl\SIVMDAETQKKFPSDLLbTSSSGELWRMVRlG 
GQPLG FDECG I VAQI AG P1AAAD I SAY Y 1ST FNFDHALVPEDG I 
GSVIEVLQRRQEGLAS 


6823 


654


221 


PPKUL.SRWARMGHGDEIVXIjSDI.NFPGIJJHLPWGPWRSVQTAC 
GI PQLLEAVLKIjIjPIjDTY VESPAAVMEIiVPSDKERGLQTPVWTE 
YES I LRRAGCVRAIjAKI ERFEF YERAKKAFAVVATGETALYGNL 
ILRKGVLALNPLL 


6824 


858


104 


kliaqrwgwg\ccffsiavsvkmnvli,fapgijjfi*i-i»tqfgfrg 
ALPKLG ICAGLQWLGI*PFI*L£NPSG ylsrs fdlgrqflfhwtv 
NWRFLPE^FLHRAFHLAIiLTAHLTLLLLFALCRWHRTGES 1 LS 
LLRDPS KRBCVPPQPLTPNQ 1 VSTI*FTSNFIGI CFSRSLHYQFYV 
WYFHTLPYIjIWAMPARWLTHLLRIiLVIXSLIELSWNTYPSTSCSS 
AALHICHAVILLQIiVTLGPQPFPKSTQHSKKAH 


6825 


3 


1173 


SSGEFGLQASDIMWTISDTGWII.IIIjCSLMEPWA1/5ACTFVHLL 
PKFDPLVILKTLSSYPIKSMMGAPIVYRMLLQQDLSSYKFPHLQ 
NCLAGGESLLP ETLENWRAQTGLD ire F YGQTETGLTCMVS KTM 
KIKPGYMGTAASCYDVQI IDDKGNVL PPGTEGD IG I RVKP I RPI 
GI FSG YVDNPDKTAANIRGDFWLLGDRG IKDEDGYFQFMGRADD 
I INSSGYRIGPSEVENALMEHPAWETAVISSPDPVRGEVVKAF 
VILALQFIjSHDPEQLTKELQQHVKSVTAPYKYPRKI EFVLNLPK 
T\TTGKIQRA\KLRDKEWKMSGKAPCAVRHLRDIHI,DSPIiIiSIjSF 
PFGPLA1jPMDGYGDSLWEEHEYKFCI±AIjVISTKLYHVRC 


6826 


2304


954 


LKTESFKPW/VNIAIJUTHIjLGERASPNSFWQPYIQTLPREYDTP 
LYFEEDEVRYLQSTQAIHDVFSQYKNTARQYAYFYKVIQTHPHA 
NKLPLKDS FTYEDYRWAVSSVMTRQNQI PTEDGSRVTLAIi 1 PLW 
DMCNKTNGLITTGYNLEDDRCEGVALQDFRAGEQI YI FYGTRSN 
AEFVIHSGFFFDNNSHDRVKI KLGVSKSDRLYAMKAEVIiARAGl 
PTSSVFALHFTEPP I SAQLLAFLRVFCMTEEELKEH1.LGDSAID 
R I FTLGNS EF^ VS WDNEVKLWTFLEDRASLLLKTYKTTI EEDKS 
VLKNHDLS VRAKMA I KLRLGEKEILEKAVKSAAVNREYYRQQME 
EKAPLPKYEESNLGLLESSVGDSRIiPLVLRNLEEEAGVQDAIiNI 
REAIS KAKATENGLVNGENS I PNGTRS ENESIjNQES KRAVEDAK 
GSSSDSTAGVKE 


6827


1 


779 


SS VVEFGIjSVI/XSLFI^FVTjENML^IjIi^^ 

ETRNIjDPBNGSGMAIjQPLQAAPBPGAQGXJREKNSQHPPAIiAPPG 
HGGHSHGHQGGTDITWMVIJJGDGIJi^^ 

LSTTLAVFCHELPHELGDFAMLLQSGLS FRRLUiLSLVSGALGL 
GGAVI/^LSLGPVPLTPWVFGVTAGVFLYVAI*VDMLPAIiFPSS 
GAPAYANHVLLOGLGIkLLGGCLMItAI TLLEERLLPVTTEG 


6828 


3 


1654 


KSQHG/WILQLMHSCKEGYVKDLKGNPGLHiiAMIiDIiDNGTRPSE 
LGHLSQTAS LKRG S S FQ SGRD DTWR Y KT PHR VAFV E KJjTKLjVTjS 
QLPNFWKLWI S YVNGSLFSETAEKSGQI ERSKNVRQRQNDFKKM 
IQEVMHSLVKLTRGAUjPLS I RDGEAKQYGGWEVKCELSGQWLA 
HAI QTVRLTHE SI/TALE I PNDLLQT I QDL ILDLRVRCVMATLQH 
TAEEIKRIiAEKEDWIVDNEGLTSLPCQFEQCIVCSLQSLKGVLE 
CKPGEASVFQQPKTQEEVCQLS INI MQVFI Y CLEQLS TKPDAD I 
DTTHLSVDVSSPDIjFGS IHEDFSLTSEQRLLIVLSNCCYIiERHT 
FLN IAEHFEKHNFQGI EK3TQVSMASLKELDQRLFENY IELKAD 
PIVGSLEPGIYAGYFDWKDCLPPTGVRNYLKEALVNI iavhaev 
FTI SKELVPRVLSKVI EAVSEELSRLMQCVSS FSKNGALQARL1S 
ICALRDTVAVYLTPESKSSFKOA1*EAIjPQI»SSGADKKI*LEEL»LiN 
! KFKSSMHLQLTCFQAASSTMMKT 


6829 


1 


782 


MRMEAGEAAPPAGAGGRAAGGWGKWVRI^VGGTVFLTTRQTLiCR 
EQKS FLSRLCQGEELQSDRDETGAYLIDRDPTYFGP IIiNFLRHG 
KLVLDKDMAEEGVUJEAE FYNI GPL 1 RI 1 KDRMEEKD YTVTQVP 
PKHVYRVLQCQEEEiTQMVSTMSDGWRFEQLVNIGSSYNYGSED 
QAEFTjCVVSKEI^STPNGLSSESSRKTKSTEEQLEEQQQQEEEV 
EEVEVEQVQVEADAQEK/ CCYKPEAPGCEAPDHIjQGIjG VP I 


6830 


1 


939


MRPGSVENLS I VYRS RD FIj WNKHWDVR I DS KAWRETLTLQKQLi 
RYRFPELADPDTCYGFRFCHQIiDFSTSGALCVAIJTKAAAGSAYR 
CFKERRVTKAYLALIiRGHIQESRVT I SHAIGRNSTEGRAHTMCI 
EG SO^CENP KPSLTDL VVLEI-IGI#YAGDP VSKVI»IiKPIiTGRTHQL» 
RV\HCSAI^HPVVGDLTYGEVSGREDRPFRMM1->HAFYLRI ptdt 
ECVEVCTPDPFIiPSIiDACWSPHTLLQSLDQLVQAIiRATPDPDPE 
DRGPRPGS PS ALLPGPGRPPPPPTKP PETEAQRGPCL»QWI>SEWT 
LEPDS 


6831 


3 


1087 


SLFFGSSTPDNKVAEQEDLETQPSPSVEKAVTVIDPEGTIPTNF 
NVAEKPADHSLSEVKLKTADEPRGTLVKSGDGQNVKEKSMIIiSN 
VEDI*Q£PKFISEVSREDYGKKEISGDSEEMNINSVVTSJuDGENLi 
EIQS YSLIGEKLVMEEAKTI VPPHVTDSKRVQKPAIAPPS KWNI 
SIFKEEPRSDQKQKSLLSFPVVDKVPQQPKSASSNFASKNITKE 
SEKPESIILPVEESKGSLIDFSEDRliKKEMQNPTSLKISEEETK 
I^VSPTEKKDNLENR\SYTL\AEKKVI>AEKQNSV\API>EIJU)S 

SEKEKDEKKKK 


6832 


1809 


412 


MGSGLISGPPQDNSGEALKEPERAQEHSLPNFAGGQHFFEYLI»V 

CFPDGNEWASLTEYPRETFSFVLTNVDGSRKIGYCRRIJjPAGPG 
PRLPKVYCI I SCIGCFGLFSKILDEVEKRHQISMAVI YPFMQGIj 
REAAFPAPGKTVTLKSF I PDSGTEFI SLTRPLDSHLEHVDFSSL. 
LHCLSFEQILQI FASAVIoERKIIFLAEGLSTLSQCIHAAAALLY 
PFS WAHTYI PWPESLLATVCCPTP FMVGVQMRFQQEVMDS PM2 
EVLLVNIX^GTFI.MSVGDEKDILPPKLQDDILDSLGQGINEIjKT 
AEQ INEHVSG P F VQFFVKI VGH YAS Y I KREANGQGHFQERS FCK 
ALTSKTNRRFVKXFVKTQLFSLFIQEAEKSKN^ 
YEEQKKQ/TETKGKNCEIRAWNKND 


6833 


1 


1129 


PLMTLSQCGGIPGHGHSHGGHGHGHGLPKGPRVKSTRPGSSDIN 
VAPGEQGPDQF^TNTLVANTSNSNGLKLDPADPENPRSGDTVEV 
QVNGNLVT^PDHMELEFJ)RAGQI2^RGVFL^TVI^ 


NAIiVFYFSWKGCSEGDFCVNPCFPDPCKAFVEI INSTHASVYEA 
GPCWVLYLDPTLCWMVCI LLYTTYPLiKESALILLQTVPKQI D 

MEVAKTI KDVFHNHG IHATTIQPE FASVGSKSS WPCELACRTQ 
CALKQCCXSTLPQAPSGKDAEKTPAVSISCLEIiSNNIiEKKPRRTK 
AENT PA\ WI EIKN\ I PNK\QPESSL 


6834 


78 


1151


AGQERPAPIWRLL.WtPTPSVSRKAEPAHIPXNR*GA*E*RGGLP 
LCGSSASAYGWH* RLTPWSPGGS + HM* SSKAPVTQARE VLVAG P 
CSKLVLSGARG I VGTTVQVLVEAQQPLLLIjFTGVWGLNIjRAGEE 

nnxT * T T DTTT mm mlMMTT fill TV t ft Tf+ TV /V"*T C/V/^IT/* 1 C TA T TV VTVT T O 

AAAAVRDCKEVXYTV SGDKQQAEVS VRL * VRDVCVEEAGCVE FGQ 
AHGRPGLAIAKGRGGTNEVEEQVQVDGVQKLVLSAHECHELVAG 
QQDGEDQAARTRLLQAGAHSVAHGRRQGQAPCRPHQEAGVSCHE 
LQQ WGDAL * ARE * APQ 1 I VLLLLEDVAQLRTG KKA * DLWDVE 
QLLRQL 


6835 


1 


834 


G I PAADR \ EASLELi I KI>D ISRT F PNIiC I FQQGG P YHDMLHS I LG 
AYTC YRPDVG YVQGMS F IAAVL.I LNLDTADAF I AFSNLLNKPCQ 
MAFFRVDHGLMLTYFAAFEVFFEEI^PKLFAHFKKNNl.TPDIYL, 
IDWIFTIjYSKSLPLDLA(^IWDVFCRIX3EEFLFRTALGILKLF^ 
DII*TKMDFIHMAQFLTR1>PE3DLPAEELFAS1ATIQMQSRNKKWA 
QVLTAI^KDSREMREGKS VTPTLRLQREFAIiGTNQSPMPRPLCC 

FRIiTPGQPRRTDAL 


6836 


1 


850 


MS CGRPP P D VDGM ITLKV \DNLTYRTS PDSLRR VFEKYGRVGDV 
YI PREP,OTKAPRGFAF\ntFHDRRDAQDAFAAMDGAEIJX3REL.RV 
QVARYGRRDLPRSRQGRRHAAGPEAA/RYGRRSRSYGRRSRSPR 
RRHRSRSRGPSCSRSRSRSRYRGSRYSRSPYSRSPYSRSRYSRS 
PYSRSRYRESRYGGSHYSSSGYSNSRYSRYHSSRSHSKSGSSTS 
SRSASTSKSSSAi<RSKSSSVSRSRSKoKi>t>oM 1 KorJ?R V £>itKKo 
KSRSRSKRPPKSPEEEGQMSS 


6837 


1 


1369 


TDGAAVAGNPGSDYFPGGTAP/GGPRTRRP\SGTSSSGSKASGP 
PNP PAQGDGTSLS PNYTLESTSGNDGKPVSGGGGRGRGRRKRDS 

RG APTPHEKA1»TS PS WG KGAEIjLLGDQ PDL I GSLDGGAKS DS S S 
PNVGEFASDEVSTSYANEDEVSSS SDNPQALVKASRS PLVTGSP 
KLPPRGVGAGEHGPKAPPPALGIjGIMSNSTSTPDSYGGGGGPGH 
PGTPGLEQVRTPTSSSGAPPPDEIHPLEILQAQIQIxQRQQFSIS 
EDQPLGLKGGKKGECAVGASGAQNGDS elgsccseavks amsti 
DLDS I»MAEHS AAWYMPADKALVDS ADDDKT1AP WEKAKPQNPNS 
KEAHDLPANKAS ASQPG SHI^CLS VHCTDDVGDAKARAS VP TWR 
SLHSDISNRFGTFVAAI/T 


6838 


16 


499 


bTDTPPPKTHMIHHSISDYKATLRCWALGFYPMEITLTWQQDEE 
DOTRDMRTiVPTR PAGDGTFOKWAAWVPSGEE /O /RYMCHVOHE 
GLPEPLTl^WEQSSQPTIPIVGIVAGLVLIiGAVVTGAVVSAVMC 
RKKNSDRVS YS EAAS S DHAQG S D VS LTACKV 


6839 


1 


1195 


AAPAGGGPDPEALSAFPGRHLSGLS WPQVKRLDAIJjSEP I PIHG 
RGNFPTLS VQPRQIRAGG PQHPGGAG \ IHVHRVRLHGSAASHVX, 
HPESGLGYKDLDLVFRMDLRSEAS FQtTKAVVXiACLLDFliPAGV 
SRAKI TPLTLKEAYVQKLVKVCTDSDRWSIil SLSNKSGKNVELK 
FVDSVRRQFEFS IDS FQI ILDSLLLFGQCSSTPMSEAFHPTVTG 
ESLYGDFTEALEHLRHRVIATRS PEE I RGGGLLKY CHLL VRG FR 
PRPSTDVRALQRYMCSRFFIDFPDLVEQRRTLERYLEAHFGGAD 
AARR YACLVTIJm WNESTVCIJ^NHERRQTLDL I AALALQAIAE 
QGPAATAALAWR PPG TIX?\A7PATVNYYVT P VQ P LLAHA Y PTWL P 
CN 


6840 


4254 


2061 


ELQGDFS VPDVP KS MAWCENS ICVGFKRDYYliIRVDGKGS I KEL 
FPTG KQLE PLVAP LADG KVAVGQDDLTVVLNEEG I CTQ KCALNW 








TDI PVAMEHQPP YI IAVLPRY VEIRTFEPRLLVQS IELQRPRFI 
TSGGSNI I YVASNHFVWRLI PVPMATQIQQLLQDKQFELALQLA 1 
EMKDDSDS E KQQQIHH I KNLYAFNLFCOKRFDESMOVFAKLGTD 
PTHVMGLYPDIjIiPTDYRKQLQYPNPLPVLSGAELEKAHIiALIDY 
LTQKRSQLVKKLNDSDHQSSTSPI^EGTPTIKSKKKLIjQIIDTT 
LLKC YLHTNVAL.VAP LLRLENNHCHI EESEHVLKKAHKYS EL 1 1 

LYEKKGI^EKALQVIjVTX^SKKAN 

HLI FSYSVWVTjRDFPEDGLKI FTEDLPEVESLPRDRVLGFLIEN 
FKGLAI PYLEHI IHVV7EETGSRFHNCI>IQLYCEKVQGI*MKEY1iL 
SFPAGKTPVPAGEEEGELGEYRQKLLMFLEISSYYDPGRLICDF 
PF1X5LI^ERALLLGRMGKHEQALFIYVHILK1>TRMAEEYCHKHY 
DRNKIX3NKDVYLSLLRMYLS PPS IHCIX3P I KLELLEPKANLQAA 
LQVLELHHS ICLDTTKALNLLPANTQINDIRI FLEKVLEENAQKK 
RFNQVbKNIil*HAEFIjRV\QEERIIaHQQVKC I ITEEKVCMVCKKK 
IGNSAFARYPNGVWHYFCS \kevnpadt 


6841 


1 


3206 


TPS TTGTKSNTPTSS VPSAAVTPLNESLQPLGDYGVGS KNS KRA 
REKRDSRNMEVQVTQEMRNVS IGMGSSDEWSDVQDI I DSTPELD 
MCPETRLDRTGS S PTQG I VNKAFG I NTDSLYHELSTAGS E VT GD 
VDEGADIJ^EFSGMGKEVGNLLLENSQbliBTKNALNVVKN^ 
KVDQLSGEQE VLRGELEAAKQAKVKLENR I KELEEELKRVKSEA 
IIARREPKEEAEDVSSYLCTESDKI PMAQRRRFTRVEMARVLME 
RNQYKERLMELQEAVRWTEMI RAS REHPSVQEK ICKST IWQFFSR 
LFSSSSSPPPAKRPYPSGNIHYKSPTTAGFSQRRNHAMCPISAG 
S RP L E F FP DDD CTS S ARRE QKREQ YRQ VREHVRNDDGRLQ ACGW 
SLPAKYKQLS PNGGQEDTRMKNVPVPVYC^PLVEKDPTMKLWCA 
AGVNI>SGWRPNEDDAGNGVKPAPGRDPLTCDREGDGEPKSAHTS 
PEKKKAKELPEMDATSSRVWILTSTL.TTSKVVI I DANQPGTWD 

LAG I TLVGCATROSrVPRSNCSSRGDTPVT^KGQGEVATIANGKV 
NPSQSTEEATEATEVPDPGPSEPETATLRPGPLTEHVFTDPAPT 
PSSG PQPGS ENGP E PDSS S TR PE PE P SGDPTGAGS SAAPTMWLG 
AQNGWL YVH S AVANWKKCLHS I KLKDSVLSLVHVKGRVLVALAD 
GTLAI FHRGEDGQWDLSNYHLMDLGHPHHSIRCMAWYDRVWCG 
YKNKVHVIQPKTMQIEKS FDAHPRRES QVRQLAW I GDGVWVS I R 
LDSTIJiLYHAHTHQHLQDVDIEPYVSKMLGTGKLGFS FVR ITAL 
LVAG SRLWVGTGNG Wl S I PL.TETWLHRGQ \ LLG \ LRANKTS P 
TSGEG\ARPGG\ I IHVYG\DDSSDRAARSFI P YCSMAQAQLCFH 
GHRDAVKFFVS VPGNVLATLNGSVLDS P AEGPGPAAPASEVEGQ 
KLRNYLVLSGGEG YI DFRI GDGEDDETEEGAGpMSQVKPVLSKA 
ERSHI I VWQVSYTPE 


6842 


3 


926 


RCQQLSATILTDHQYLERTPLCAILKQKAPQQYRI RAKLRS YKP 
RRLFQSVKLHCPKCHLLQEVPHEGDLDI I FQDGAT KTP D VKLQN 
TSLYDS KI WTTKNQKGRKVAVHFVKNNG ILPLSNE CLLL I EGGT 
LSEICKLSNKFNSVI PVRSGHEDLELLDLSAPFLI QGTVHHYGC 
KQWST*RSIQNLNSLVDKTSWIPSSVAEALGIVPLQYVFVMTFT 
LDDGTGVLEAYLMDSDKPFQI PASEVLMDDDLQKSVDMIMDMFC 
P PG I KI DAYPWLECFI KS YNVTNGTDNQ I CY Q I FDTT VAEDVI 


6843 


2 


851


NHRKVLSGAKRYE CNECGKS FAYTSSLI KHRRIHTGERPYECSE 
CGRSFAENSSLIKHLRVHTGERPYECVECGKSFRRSSSLLQHQR 
VHTPvERPYECSECGKSFSLRSr^IHHQRVHTGERHECGQCGKSF 
SRKSSLI XHLRVHTGERPYECSDCGKSFAENSSLIKHLRVHTGE 
RP YECIDCGKS FRHS SSFRRHQRVHTGMRPYK*SKFWKFSCPG F 
LLLQGQRVHTGSRCYECDKWGI FFS*NASFFT+ KSAPTEEVPFE 
CNECEKAFS PLSLVTTXFT 


6844 


244 


642 


EHQLAGFELRKTQTSMSLGTTREKTDRVKSTAYLSPQELEDVFY 
QYDVKS E I YS FGI VLWE I ATGD I P FQGCNS EKIRKLVAVKRQQE
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








PLGEDCPSEIiREIIDECRAHDPSVRPSVDEII,Kia>STFSK*CIK 
T 


6845 


3 


1519 


VAVRDECY WRHVFWDQDLWMLLFI LMCHPETARARLE YRI RTLD 
GALENAQN1X5YQGAKFAWESADSGLEVCPEDIYGVQEVHVNGAV 
GLAFEiYYHTTQDLQLFREAGGWDVVRAYAEFWCSRVEWSPRBE 
KYHLRGVMSPDEYHSGVNNSVYTJiVLiVQNSLRFAAALAQDLGLP 
I PS QWLAVADKI KVP FDVEQNFHP EFDGYEPGEVVKQADVVLLG 
YPVPFSLSPDVRRKNLE I YEAVTS PQGPAMTWS MFAVGWME1«KD 
AVRARG LliDRS FANMAEPFKVMTENADGSGAVNFTjTGMGG FLQA 

S EDSVTVEVTARAGPWAPHLEAELWPSQSRIiSLLPGHKVS FPRS 
AGRIQMSPPKLPGSSSS EFPGRTFSDVRDPLQS PLWVTLGSSSP 
TESLTVDPASE^SGTGASETSLGPSLWPRLHPPLbGTIJLACHPS 
PAARLSGKVHAAWPEFKAFCL 


6846 


213 


1258 


LYFLKTIK*LNRLAEHP*YENEKLTKLRNTIMEQYTRTEKSARG 
1 1 FTKTRQSAYALSQW I TENEK FAE VG VKAHHL. I GAGHS SEFKP 
MTQNEQKEVISKFRTGKINL,LIATTVAEEGI>DIKECNIVIRYGL 
VTNE IAMVQARGRARADESTYVLVAHSGSGV I EHETVN D FREKM 
PaYKAIHCVQNMKPEE YAHKI LELQMQS II^KKMKTKRNIAKHYK 
NNPSLITFLCKNCSVIiACSGEDIK^IEKMHHVNMTPEFKEliYIV 
RENKTLQKKCADYQINGEI I CKCGQAWGTMMVHKGLDL P CIiKIR 
NFVVVFKmSTKKQYKKWVELPITFPNLDYSECCLFSDED 


6847 


1450 


348


SMCWNSDRLEMPLIDLAblLYPPSYVPYTGHLSDDSLSRKYCLT 
WFEDALNGVL* RAEAIQPHCVNAGDRMEKFRQKYWNKLQTLRQQ 
V rAi v» fblvKa LLiUlRhn OJjN E F N r P JJ P x o K. V K n. N L» VAL»R£Ir 
PGVVRSLDALGWEERQIiALVKGIJAGWFDWGAKAVSAVLESDP 
YFGFEEAKRKLQERPWLVDS YSEWLQRLKGPPHKCAL1 FADNSG 
IDI I LGVFPFVREiJjIiRGTEVIIiACNSGPAljNDVTHSES LI V7AE 
R I AGMD PWHS AI»REERliI*IiVQTGS S S PCLDLS RIiDKGLAAL VR 
ERGADLWIEGMGRAVHTNYHAALRCESLEOiAVI KNAWLAERLG 


6848


19 


16 


AMWWNSUX5IRNIVLSNPKKRin , LSIJU4I^LQSDILHDADSND 
LKVI IISAEGPVFSSGHDLKEIiTEEQGRDYHAEVFQTCSKVMMH 
I RNHP VP VI AM VNGLATAAGCQIiVAS CD I AVAS DKS S FAT PG VN 
VGLFCSTPGVALARAVPRKVALE^FTGEPISAQE 
WPEAELOEETMRrARKIASLSRPWSLGKATFYKQLPQDLGTA 
YYLTSQAMVDNLALRDGQEGITAFLQKRKPVWSHEPV*VEH 


6849 


70 


821 


SL»GVDGSCLEQGSPAPRPQTDTSP*PVGNWATQQEDbYHQSYEC 
VCVLFASVPDFZEFYSESNI NHEGLECIiRLLNE I IADFDEKLS K 
PKFS GVE KIKTIGS TYMAATGLNATS GQDAQQD AERSCS HLtGTM 1 
VEFAVAlX5SKIJ)VINKHSFNITFRLRVGLNHGPVVAGVIGAOKPO 
YDIWGNTVNVASRMESTGVLGKIQVTEETAWALQSLGYTCYSRG 
VIKVKGKGQIiCTYFLNTDLTRTGPPSATLG 


6850 


2 


1235 


ARGLNHEWTFEKLRQHISRNAQDKQEIiHIjFMIjSGVPDAVFDLTD 

LDVLKI^IPEAK^PAKISQMTNLQELKLCHCPAKVEQTAFSFL 

RDHIJICLHVKFTDVAEIPAWVYI^KNL^ 

GI^SLREIjRHLKILHVKSNLTKVPSNITOVA^ 

KLLVI^SLKKMMNVAELELQNCEIiERI PHAIFSI»SNLQEI»DLKS 

NNIRTIEEIISFQHLKRJjTCIiKLWHNKIVTIPPSITlIVKNLESL 

YFSNNKLESLPVAVFSI^KLRCLDVSYNNISMI PIEIGLLQNLQ 

HLHITGNKVDILPKQLFKCIKI^TLNLGQNCITSltPEKVGQIiSQ 

LTIQLELKGNCIJDRLPAQI^QCRMLKKSGLVVEDHLFDTLPLEVK 

EALNQDINIPFANGI 


6851 


1765 


660 


VSAQVSAREGENCLGWNLADSSQESYKSliEEAEDCYPPSLLTLD 
IiRDLFNQVEQGPLLSCPKAGTDLSMGRAREVGWMAAGliMIGAGA 
CYCVYKLTIGRDDSEKLEEEGEEEWDDDQELDEEEPDIWFDFET
MARPWTBDGDWTBPGAPGGTEDRPSGGGKANRAHPIKQRPFPYE 
HKNTWSAQNCKNGSCVLDLSKCLFIQGKIjLFAEPKDAGFP fsqd 
INSHIJ\SI>SMARKTSPTPDPTWEALCAPIWLNASIESCK3QIKM 
YINEVCRETVSRCCNS FLQQAGLNLLISMTVINNMLAKSASDLK 
FPLI SEGSGCAKVQVLKPLMGLSEKPVLAGELVGAQMLFS FMSL 
P I RNGNRB I LLETPAP 


6852 


1 


407 


RTRG EETYANFI KHNDGKN1 FYAART PATLFAVMFAMY 1 1 SGLT 
GFIGLNS IAVXiCNIjVMGLAIjI FLCTWAYVKYSGEFREIGTVTDQ 
IAETLWEQVIiKPLGDNliMEENIRQSVTNS I KAGLTDQVS HHARL 
KTD 


6853 


3 


469 


GDS CAVCIEJLYKPNDLVRIIiTCNHI FHKTCVDPWLLEHRTCPMC 
KCD ILKALG I EVDVEDGSVSLQVPVSNEI PNS ASSHEEDNRSET 
AS SG YASVQG TYEP PI*EEHVQSTNE SI^LVNHEANS VAVDVI PH 
VDNPTFEEDETPNQETAVREI KS 


6854 


1148 


585 


hes y i gtfdpg elcvcaai qwlqdn s as yfiinrklvye p s tqak 
pvkntflrmwi yshhi yqqdlrkki 1jdvgkrldvtgfcmtgkpg 
1 1 c^egfkehceefwhtirypnmkhisckhaesvetegngedlr 
lfhs fefj j lleahgdygiju^yhmnix;qfi^fi.kkhksehvfqi 
lfgieskssds 


6855


1913 


1148 


GRVGGRVGR I CS PbSGANE Y lASTDTLiKTEE VLL.FTDQTDDLAK 
EEPTSIiFQRDS ETKGESGI>VIiEGDKE I HQI FEDLiDKKLAIiASRF 
YI PEGCIQRWAAEMWAiiDALHREG I VCOTLNPNNXLLNDRGHI 
QliTYFSRWSEVEDSCDSDAIERMYCAPEVGAI TEETEACDWWSL 
GAVL FELI/TG KTLVECHPAG I NTHTTliNMPEWVSEEARSL. IQQL 
LQFNPLERLGAGVAGVED I KSHPFFTPVDWAELMR 


6856 


1617 


997


VTQLYVSVDASTKDSLKKIDRPLFKDFWQQFLDSLKALAVKQQR 
TVYRIiTLVKAWNVDELQAYAQLVSIiGNPDFI EVKGVTYCGESS A 
S S LTMAHVP WH EE WQ FVR ELVD L I PEYEIACEHEHSNCLLIAH 
RKFKIGGEWWTWINYNRFQELIQEYEDSGGSKTFSAKDYMARTP 
HWAL FGAS ERGFDPKDTRHQRKNKS KAI SGC 


6857 


1 


617 


KGPEATAMVCVCSHPNCRQNHIKPSHSAAQTWCGSPTPASAPNH 
KI^AMEQGKTLPSATEDAKEEGLEAQ ISRLAEL IGRLESKALWF 
DLOXJPXSDEIX3TNMHIjQI»VRQEMAVCPEQLiS E FI*DSI*RQ YIjRGT 
TGVRNCFHI TAVRIjSDG FTFVI YEFWETEEAWKRHLQS PLCXAF 
RHVKVDTLSQP EAIiSRILVPAAWCTVGRD 


6858 


2


669 


RSRG I KDFEND P PLSSCG I FQSRIAGDALIiDSG I RI SS VFAS PA i 

LRCVQTAKLIIjEELKLEKKIKIRVEPGIFEWTKWEAGKTTPTLM 

SLEELKEANFNIDTDYRPAFPLSALMPAESYQEYMDRCTASMVQ 

IVNTCPQDTGVILIVSHGSTLDSCTRPLLGLPPRECGDFAQLVR 

KI PS LGMCFCEENKEEGKWELVNP PVKTLTHG ANAAFNWRNW I s j 

GN 


6859 


1 


1150 


GETMFKKAKTKAKKKPRKRSDSSGGYNLSDIIQSPSSTGLLKSG 
KTNSVESLPEI^TSDSEGSYAGVGSPRDLQSPDFTTGFHSDKIE 
AKVKP YVNGTS PVYSREDLKPWEKSP I bKISAPQP I PSNRIDTT 
SSASWVAGSFS PVSPPWDLRTIME I EESRQKCGATP KSHLG KT 
VSHGVKLSQKQRKMIALTTKiamSGMNSMETVLFTPSKAPKPVN 
AWASSLHSVSSKS FRDFLLEEKKSVTSHSSGDHVKKVSFKG I EN 
SQAPKIVRCSTHGTPGPEGNHISDLPLLDSPNPVJLSSSVTAPSM 
VAPVTFAS I VEEEliQQEAAIi IRSREKPIAIjI Q I EEHAI QDLLVF 
YEAFGNPEEFVIVERTPQGPIiAVPMWNKHGC 


6860 


1889 


1515 


DKDKKRQKKRGIFPKVATNIMRAWLFQHLTHPYPSEEQKKQLAQ
DTGLTILQVNNWFINARRIIVQPMIDQSNRAVSQGAAYSPEGQP
MGSFVLDGQQHMGIRPAGPMSGMGMNMGMDGQWHYM 


6861 


1889 


1515 


DKDKKRQKKRGIFPKVATNIMRAWLFQHLTHPYPSEEQKKQLAQ
DTGLTILQVNNWFINARRIIVQPMIDQSNRAVSQGAAYSPEGQP
MGSFVLDGQQHMGIRPAGPMSGMGMNMGMDGQWHYM


6862 


2 


471 


EE I DREFHNKLKLKEDKLEKQE KPVNGEDKGDSGVDTQNSEGNA 
DEED PLG PNCYYDKTKS FFDN I S CDDNPJ5RRPTWAEERRLNAET 
FGI PLRPNRGRGGYRGRGGIiGFRGGRGRGGGRGGTPTAPRGFRG 
GFRGGRGGRE FAD FE YRKTTAFG P 


6863 


2216 


487 


PQEPALKSEFSQVASNTIPLPLPQPNTCKDNGPCKQVCSTVGGS 
A T r c C P Dfi Y AT MADGVS CEDODECIiMGAHDCS RRQFCVNTLiGS F 
YCVNHTATLCADGYIIiNAHRKCVDINECVTDIiHTCSRGEHCVT^L 
GSFHCxKALTCEPGYALKDGECEDVDECAMGTHTOQPGFLiCQNT 
KGS FYCQARQRCMDGFLQDPEGNCVDINECTS LSEPCRPG FSCI 
NTVGS YTCQRNPLI CARGYHASDDGT1CCVDVNECETCVHRCGEG 
QVCHNLPGSYRCDCKAGFQRDAFGRGCIDVNECWASPGRLCQHT 
CENTl^SYRCSCASGFIjI*AAIX3KRCEI?VNECEAQRCSQECANIY 
GS YQCYCRG<3YQLAEDGHTCTDI DE CAQGAG I LCTFRCT/NVPGS 
YQCACPEQGYTMTANGRSCKDVDECALGTHNCSEAETCmiQGS 
FRCLR FECP PNYVQ VS KTKCERTTCHDFliECQNS PAR I THYQLN 
PQTG LLi V PAH I FRIG PAP AFTGDTIALNI IKGNEEGY FGTRRLN 
A YTG WYIjQRAVLE P RD FAliD VEM KL WRQGS VTT FLAKMHI FFT 
TFAL 


6864 


2 


2933 


IiADSS PSNLQI I IKELLSMHHQPDPALTKEFDYLPP VDSRS SSG 

fvglrnggatcymnavfo^lymqpglpesllgvdddtdnpddsv 
fyqvqslfghiimesklqyyvpewfwkifkmwnkelya^o^day 

EFFTSLIDQMDEYLKKMGRDQIFKNTFQGIYSDQKI ckdcphry 
EREEAFMAIjNIjGVTSCQS 1*E I SIjDQ FVRGEVIjEG SNAY Y CE KCK 
EKRI TVKRTCI KSLtPSVLVTHXiMRFGFDWESGRS I KYDEQ I RFP 
WMLNMEPYTVSGMARQDSSSEVGE^GRSVDCCGGGSPRKKVAL.T 
ENYELVGVIVHSGQAHAGHY^SFIKDRRGCGKGKVnTKFlTOTVIE 
EFDLNDETLEYECFGGEYRPKVYDQTNPYTDVRRRY>/NAYMLFT 
QRVSDQNSPVLP KKSRVSWRQEAEDliSLSAPSS PE ISPQS SPR 
DiroiJTJMnPT.QTT.TKlA/inCGEKKGIjPVEKMPARIYOMVRDENIiKF 
MKNRDVYSSDYFS FVLSIASliNATKLKHP YYPCMAKVS I/QLAI Q 

FLFQTYXjRTKKKIjRVDTEEW iatieallsksfdacqwlveyfi s 

SEGRELIKIFLLECNVREVRVAVATIIiEKTLDSALFYQDKLKSL 
HQL1»EVIjLAI^DKI)VPENCKNCAQYFFTj Fin*FVQKQG I RAGDIiL 
LRHS ALRHM I S FLLGASRQNNQ IRRWS S AQAREFGNLHNTVAiL 
VLHSDVS SQRNVAPG I FKQRPP I S XAPS S PLIjPLHEEVEALI*FM 
SEGKPYLI^VMFALRELTGSLLALIEMVVYCCFCNEHFSFTMLH 
FIKNQbETAPPHELKNTFQLLHEILVIEDPIQVERVKFVFETEN 
GLLAliMHHSNHVDSSRCYQCVKFXiVTIaAQKCPAAKJSYFKENSHH 
WSWAVQWLQKKJ4SEHYimiQSirVSOTTSTGKTFQRTISAQDTLA 
YATAIiLNEKEQSGS SNGSESS PANENGDRHLQQGS E S PMMI GEL 
RSDLDDVDP 


6865 


1820 


1242 


DPERWKHLSKVTPPGSSVSTTPVQWRiQSPQSQGSMMPSCNRS 
C^CSRGPSVEDGKWYGVPsSYIjHI^FYEGYAVPPKLEGIGEGEFIjV 
LDQRAAD YIIQAIjGTCRIiAGTALCVAAGVIjIiAI CLFWAM I GWLSQ 
DTKAEPLDPEADSHVEVFGDEPEQQIjS P I FRNASGQS wfs ppas 

PFGQSSVQTIQPKRDS


6866 


1571 


495 


DCPRPRYTLYGIiRATCMRDLDWAW INAVSAFKALEQDLPVNI KF 

IIEGMEEAGSVALEELVEKEKDRFFSGVDYIVISDNLWISQRKP

AITYGTRGNSYFMVEVKCRTODFHSGTFGGIIJIEPMADIjVAIjI.G 
SLVDSSGHILVPGIYDEVVPLTEEEINTYKAIHIiDLEEYRNSSR 
VEKFTjFDTKEE ILMHLWRYPSLS IHGIEGAFDEPGTKTVI PGRV 
1 G KFS 1 RI» V PHMNVS AVE KQVTRHIoEDVFS KRN S SNKHVV S MTL 
GI^POANIDDTQYIiAAKRAIRTVFGTEPDMIRDGSTIPIAKMF 
QEIVHKSWLI PIjGAVDDGEHSQNEKINRWNY I EGTKLFAAF FI* 
EMAQLH


SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence




6867 


2833 


1704 


GTRIMSQPKQKELAGFVRQKMLLDYSVYMGRCVPQESRSPQRSP 
LQSAESSPTAGKKIjPEVPPSEEEEQEAWVNALIiGRIFWDFLiGEK 
YWSDLVSKKI QMKl^KIKI^PYFMNELTLTELDMGVAVPKILQAF 
KP YVDHQGLW I DL.EMS YNGS FLMTLETKMNLTICLGKE PLVEAL.K 
VGE I G KEGCRPRAFCLADS DEESS S AGSSEEDDAPEPSGGDKQIj 
LPGAEGYVGGHRTSK1MRFVDKITKSKYFQKATETEFIKKKIEE v 
VS NTP LLLTVE VQ E CRGTLA VN I P P P PTDR VWYGFRKP PHVELK 
ARP KLGEREVTLVHVTDW I E KKLEQEFQKVFVM PNMDDVYI TIM 
HSAMDPRSTSCLLKDPPVEAADQP 


6868 


1 


346 


RPTR PPTRPEE IKNL.ILP YI SDMNFVQDLCEDFYELFKTDKGFD 
KATFESQMS VMRGQXLNI/TQAIjRDGKS PFQLVQ I PCVI VERS QG 
GSQGR I VHLSNS FTQTVNCRKPFFSSW 


6869 


3 


1619 


MYMERMDKRALI S FWES VEHLiKNANKNEI PQLVGEI YQNFFVES 
KE I S VEKSLYKE I QQCLVGNKG I EVFYKI QEDVYETL»KDR YY PS 
FTVSDLYEKLLIKJiEEKHASQMISNKDEMGPRDEAGEEAVDDGT 
NQINEQASFAVNIOjRELNEKIjEYKRQALNSIQNAPKPDKKIVSK 
LKDEI ILIEKERTDLQLHMARTDWWCENLGMWKAS ITSGEVTEE 
NGEQLPCYFVMVS lqevggvetknwtvpkrlse FHNLHRKLSEC 
VPSL.KKDQLPSLSKX.PFKS IDHTFMEKFENQLNKFLQNliLSDER 
L CQS EALYAFL S PS PDYLKVTDVQGKKNSFSLSS FLERIiPRDPF 
SHOEEETEEDSDL»SDYGDDVDGRKDALAEPCFMLIGEIFEI>RGM . 
FKWVRRTIilALVQVTFGRT INKQIRDTVSWI FSEQML.VYY INI F 
RDAFWPNGKIAP PTTIRS KEQSQETKQRAQQKLLEN I PDMLQS L» 
VGQQNARHGI I KI FNALQETRANKHLLYALMELLL.IEL.CPELRV 
HLDQLKAGQV 


6870 


1 


1566 


MAAWAATRWWQLLX.VLS AAGMGASGAPQ PPNI LLLLMDDMGWG 
DI^ VYGEPSRETPN1JDRMAAEGIJjFPNFYSANPIjCSPSRAAL.LT 
GRLP I RNGF YTTNAHARNAYTPQE I VGG I PDSEQLLPELL»KKAG 
YVSKIVGKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKARP 
NIPVYRDWFjyiVGRYYEEFPINLKTGEANLTQIYLQEALiDFI KRQ 
ARHHPFFLYWAVDATHAP VYASKPFLGTSQRGR YGDAVRE I DDS 
IGKILELLQDLHVADNTFVFFTSDNGAALISAPEQGGSNGPFL.C 
GKQTTFEGGMREPALiAWWPGHVTAGQVSHQLGS IMDLFTTSLAJL. 
AGLTPPSDRALDGXJnTLLPTLLQGRLMDRPI FYYRGDTLMAATLG 
QHKAHFWTWTNSWENFRQG IDFCTCQNVSGVTTHNLEDHTKIiPIi 
IFHLGRDPGERFPL.SFASAEYQEALSRITSWQQHQEAI,VPAQP 
QLNVC^AVMNWAPPGCEKLGKCLTPPES I PKKCLWSH 


6871 


205 


1126 


RMSLN P P I FLKRSEENSSKFVET KQSQTTS IAS ED PLQNLCLAS 
QEVLQKAQQSGRS KCIiKCGGSRMFYCYTCYVPVENVPI EQI PLV 
KLPLKIDI I KHPNETDGKSTAIHAKLLAPEFVNIYTYPCI PEYE 
EKDHEVALiIFPGPQSISIKIJISFHIjQKRIQNNVRGKNDDPDKPS 
FKRKRTEEQEFCDLNDSKCKGTTLKKI I FIDSTWNQTNKI FTDE 
RLCOLL^VELKTRKTCFWRHQKGKPDTFLSTIEAIYYFLVDYHT 
DILKEKYRGQYDNLLFFYS FMYQLI KN AKCSGDKETG KLTH 


6872 


880 


459 


FGLLMVVLS LI FMKGNCVREDLI FNFL.FKLGLDVRETNGL.FGNT 
KXL.ITEVFVRQKYLEYRRI PYTEPAEYEFLWGPRAFLtETSKMLV 
LRFLiAKI^KKDPQSWPFHYI^LAEC^WEDTOEDEPLTTGDSAHG 
PTSRPPPR 


6873 


1929 


955 


DEQAVIiCSKDKT YDL.KIADTSNMLL.F I PGCKTPDQLKKEDSHCN 
I IHTEIFGFSNNYWEI^RRRPKLKKLKKLLMENPYEGPDSQKEK 
DSNSSKYTTEDLliDQIQASEEE IMTQLQVLNACKIGGYWR ILE F 
DYEMKIJC»NHVTQLVDSESWSFGKVPLNTCI^EI^PI»EPEEMIEH 
CLKCYGKKYVDEGEVYFELDADKI CRAAARMLLQNAVKFNLiAE F 
QEVWQQSVTEGMVTSIJX^LKGLALVDRHSRPEIIFLLKVDDL^PE 
DNQERFNSLFSIJU2KWTEEDIAPYIQDLCGEKQTIGAL.LTKYSH 
SSMQNGVKVYNSRRP IS


6874 


1 


307 


DSIADHVNSAAVNVEEGTKNLGKAAXYKLAAIjPVAGAI* 

GPIGLIiAGFKVAGIAAALGGGVIXSFTGGKLTQRKKQKM^ 

SCPDLPSQTDKKCS 


6875 


1688 


349 


VIGTGERGNSASEKWBIMFNEELGDPFI I1HS ISLI^NAEEHSIA 
TT »TtT >R I EKBELDMICGSG FYVSLEVrVT I SKKNQDNKKYE 1 1 KRD I 
LRGKS VPHYAAI EPDGNGLMI VS YKSLTFVQAGQDLEENMDEDI 
SEKI KEPL YTOQO/TEDDLTVTI RLPEDNTKED I Q I QFLPDHINI 
VLKDHQFLEGKLYSS IDHESSTWI I KESNSLEI SLI KKNEGIiTW 
PELVIGDKQGELIRDSAQCAAIAERLMHLTSEEIiNPNPDKEKPP 
CNAQELEECD I FFEESSSI^FDGNTIJCTTHVVNLGSNQYIjFSV 
rVDPKEMPCTCLRHDVDALLWQPHSSKQDDMWEHIATFNALGYV 
citp n \ne c tr & fa. pmv <j yjx at .PRn ,rp vf 7 YR OP APMSTVXjYNR 
KEGRQVGQVAKQQVASLETNDPI LGFQATNBRIiFVLTTKNIjFLI 
KVNTEN 


6876 


41 


1285 


VGEMTIiI WRHJjTjRPIiCIiVTS A PR I LEMHP FLS LGTSRTS VTKL»S 

LHTKPRMPPOTFMPERYQVIFLVNSGSEANELAMI^^ 

DI ISFRGAYHGCSPYTLGLTNVGIYKMELPGGTGCQPTMCPDVF 

VAKSIAGFFAEPIQGVNGVVQYPKGFTjKEAFELVRARGGVCIAN 
EVQTGFGRLG SHFWG FQTHD VLPD I VTMAKG I GNG FPMAA VITT 
PE IAKS XuAKCJjQHFNTFGGNPMACAI GS AVLE VIKEENIiQENSQ 
EVGTYMI^KFAKLRDEFEIVGDWGKGI^IGIEMVQDKISCRPL 
PREEVNQIHEDCKHMGLLVGRGS I FSQTFRIAPSMCITKPEVDF 


6877 


1 


778 


GTS PS PARAYAP PTERKRFYQNVS I TQGEGGFBINLDHRKIjKTP 
QAKLFTVp S EALAIAVATE WDS QQDT I KYYTMHLTTX.CNTSLDN 
PTQRNKDQIj I RAAVKFLDTDT I CYRVEEPETLVEIjQRNEWDP 1 1 
EWAEKRYGVE1 SSSTS IMG PS I PAKTREVLVSHIAS YNTWALQG 
IEFVAAQTjKSMVLTLGLIDIiRIjTVEQAVLLSRIiEEEYQ I QKWGN 
IEWAHDYEI^ELRARTAAGTLFIHLC^F^TTVKHiU J l»KE 


6878






OTLOGDFKNRAEMI DFWI RI KNVTRSDAGKYRCEVSAPSEQGQN 
I£EDTVTI J EVI.VAPAVPSCEWSSAI«SGTVVELRCQDKEGNPAP 
EYTW FKDG I RLLEN PRIjGSQSTNS S YTMNTKTGTtrQFNTVS KLD 
TGEYSCEARNSVGYRRCPGKRMQVDDIjNISGI IAAWWALVIS 
VCGJjGVCYAQRKG YFS KETS FQKSNS S S KATTMS ENDFKHTKS F 
II 


6879


3 


845 


IRVIGESDIMQEFI*SESDENYNGVSDVEI»RVALPDGTTVTVRVK 
KNSTTDQVYQAI AAKVGMDSTTVNYFALFEVI SHSFVRKLAPNB 
FPHKLYIQNYTSAVPGTCLTI RKWLFTT EBE I LIiNDNDLAVTYF 
FHQAVDDVKKGY I KAEEKS YQI^KL YEQRKMVM YLNMLRT CEGY 
NEIIFPHCACDSRRKGHVI TAJS ITHFKLHACTEEGQLENQVIA 
FEWDEMQRWDTDEEGMAFCFEYARGEKKPRWVKI FTPYFNYMHE 
CFERVFCELKWRKEEY 


6880 


2110 


1437 


RKDN CTAKEWT F P EAKWN TT AR V FS H I RLGMGHVL I IVQCFI SS 
MANIYNEKILXEGNQLTES IFIQNSKLYFFGILFNGLTLGLQRS 
NRDQI KNCGFFYGHRAFSVALI FVTAFQGLSVAF I LKFLDNM FH 
VLMAQVTTVI I TTVS VLVFDFRPSLE FFLEAPSVLI»S IFIYNAS 
KPQVPE YAJ?RQERI RDI^GNLWERSSGDGEEIjERLTKPKS DES D 
EDTF 


6881 


2638 


2244 


NDSKWEDI HVIltSALJO^FFRELPEPIjJe^FNHF^roFVNAI KQEPR 
QRVAAVKDL IRQLP KPNQDTM Q I L PRHL.RR V I ENG E KNRMT YQ S 
IA I VFG PTLLKP E KETGN I AVHTVYQNQ IVEL. I LLELS S I FG R 


6882 


1 


850 


GIPEAQIiWIYPVKSCKGVPVSEAECTAMGlJ^GNLRDRFWIiVIN 
QEGNMVTARQEPRLVLISLTCTODTLTI^AAYTKDLLI^IKTPT 
TNAVHKCRVHGIiE IEGRDCGEATAQWI TS FLKSQP YRLVHFE PH 
MRPRRPHQIADI.FRPKDQIAYSDTS PFLI LSEASLADLNSFJUEK
KVKATN FRPN I VI SGCDVYAEDS WDBUblGDVEEiKRVMACSRCI 
LTTVDPDTGVMSRKEPLETLKSYRQCTPSERKLYGKS PLFGQYF 
VLENPGTIKVGDPVYLLGQ 


6883


2794 


2256 


NSKLKLNQNLKLFI TLTYQVLS LHGWGPGIHLQKEGAFP VTQNR 
ALQLLYDLRYLNIVLTAKGDBVKSGRSKPDSRI EKVTDHLEALI 
DPFDLDVFTPHI^SNIJIRLVQRTSVLFGLVTGTENQLiAPRSS'rF 
NSQEPHNILPLASSQIRFGIjrjPLSMTSTRKAKSTRNIETKAQYD 
ANC 


6884 


2 


99 


BFERVTAEAVKPRETSEPRAAAQRFCEKFPFL 


6885 


297 


1554 


STGQFWHVTDLHUDPTYH ITDDHTKVCASSKGANASNPG PFGDV 
LCDS PYQL ILS AFDFI KNSGQFAS FMI WTGDS PPHVPVP ELSTD 
TVINVITNMTTTIQSLFPNLQVFPAI^NHDYWPQDQLSVVTSKV 
YNAVANLWKP WLD E EA I S TIJ^GG FYSQ KVTTNPNLR IIS LNTN 
LYYGPNIMTLNKTDPANQFEWIiESTLNNSQQNKEKVYI I AHVPV 
G YLP S SQNITAMREYYNE KL XD I FQKYSDVIAGQF YGHTHRDS I 
MVLSDKKGSPVNSLFVAPAVTPVKSVLEKQTNNPGIRLFQYDPR 
DYKLLDMLQYVLNLTEANLKGES I W KLE Y ILTQTYD I EDLQPES 
T .Yf? T .ATf OFTTLDQKOF I KYYNYFFVS YDSSVTCDKTCKAFQ I CA 
IMNLDNISYADCLKQLYIKHNY 


6886 


2 


1341 


QCGGI PGREGG S SR PLEEGTGS SPACVRGAAPGS EDAFYPTRAK 
QARVSQELKKAAKRTVS ISEGPDHjGDGMRERRETLALAPEPEP 
LEKEACEKWKRPFRSASATSLTLSHCVDVVKGI>LDFKKRRGHSI 
GGAPEQRYQI I PVCVAARLPTRAQDVLDAHLSEVNAVRFGPNSS 
LIiATGGADRLIHLWNWGSRLEANQTLEGAGGSITSVDFDPSGY 
Q VIiAAT YNQAAQL WKVG EAQ S KKTLS GHKXJKVTAAKFKLTRHQA 
VTGSRDRTVKEWDLGRAYCSRTINVLSYCNDVVCGDHI I ISGHN 
DQKIRFWDSRGPHCTQVIPVQGRVTSLSLSHDQLHLLSCSRDNT 

LKVIDLRVSNIRQVFRADGFKCGSDWTKAVFS PDRS YALAGS CD 
GALYIWDVDTGKLESRiQGPHCAAVNAVAWCYSGSHMVSVDQGR 

KWLWQ 


6887 


1047


116 


' WTARPS QKP FWEAGAVPGDPLSTGCS QAQLGGCC PRGP WGPQHG 
GQQRAAGPTLPRGERGGPQQSGPGLAAQTPPTSKQVAWRAFLTG 
TYRSQS PRS PAGP FRGGTGWW PE PAVCLCVAVGPQRLSS PGLVY 
NASG S EHCYD IYRL YHS CAD PTGCGTG PD ARAWD YQACTE INLT 
FASNNVTDMFPDLPFTDEiLRQRYCIiDTWGVWPRPDWLIjTSFWGG 
DLRAASNIIFSNG1TLDPWAGGGIRRNLSASVIAVTIQGGAHHLD 
LRASHPEDPASWEARKLEAT I IGEWVKAARREQQPAIiRGGPRL 
SL


6888 


1 


992 


FVAYVKKEI PH IWTHCLLNPHALVI KTLPTKLRDALFTWRVI 
NFI KGRAPNHRLFQAFFEE I G I E Y£ VLLFHTEMRWLS RGQILTH 
IFE>TYEEINQFLHHKSSNLVDGFENKEFKIHLAYLADLFKHLNE 
LSASMQRTGMNTVSAREKLSAFVRKFPFWQKKIEKRNFTNFPFL 
EE 1 1 VSDNEG I FIAAE I TLHLQQLSNFFHG YFS I GDLNEAS KW I 
LDPFLFNI DFVDDSYLMKNDLAELRASGQILMEFETMKLEDFWC 
AQ FTAFPNLAKTALE I LMPFATTYLCELGFS ITFTFQNKVPEAA 
LILSDDIRVAISKKVPSFLGHH 


6889 


1 


1534 


LTLENQI KE EREQDNS ES PNGRTS PLVSQNNEQGSTLRDUjTTT 
AGKLRVGSTDAG IAFAP VYSMGAPSSKSGRTMPN I LDD I IASW 
ENKI PPSKTS KINVKPEIjKEEPEES I ISAVDENNKLYSDI PHSW 
I CE KHI LWLKDYKNS SNWKLFKECWKQGQ PAWSGVHKKMNI SL 
WKAES I SLDFGDHQADLLNCKDS 1 1 SNANVKE FWDGFEBVSKRQ 
KNKSGETVVLKLKDWPSGEDFKTMMPARYEDLI.KSLPLPEYCNP 
EGKTNIJ^HLPGFFVRPDUSPRLCSAYGVVAAKDHDIGTTNIaHI 
EVS D WN I LVYVG IAKGNG I I*S KAG I LKKFE EEDLDD I LRKRLK 
DSSEI PGALWHI YAG KDVDKI RE FLQKI S KEQGLEVL PEHDP I R 
DQS WYVNKKLRQRLLEEYG VRTWTL I QFLGDAI VLPAGALHQ VQ
NFHSCIQVTEDFVSPEHLVESFHLTQKLRLLKKEINYDDKLQVK 
NII»YHAVKEMVRAI»KIHEDEVDDMEEN 


6890 


3 


667 


THACGMW I P L.YLHRALWHKTAETCNS P P CGAXDS L I PGAITCF 

TGFIiGVDTGAGATRWCIUiKTQRADPLVCAVGW 

AAKSS I VGAYI CI FVGETLLFSNWAI TAD I LMYWI PTRRATAV 

ALQS FTSHLLGDAGS PYLIGFI SDLIRQSTKDSPLWEFLSLGYA 

LMIiCPFVVVLGGMFFlATALFFVSDRARAEQQVNQIiAMPPASVK 

V


6891


1 Q Q ft 


J- £. a jz. 


t.ot HO FT .T »S KF I »lCL»LI?n TTI E 1 1 H I GLAAG KEOFMODASNVMO 
LLLKTQSHLYIWEDNNPEVRQAAAYGI^V^^QFGGDDYP^LCSE 
AVPLLVKVI KRAHSKTKKNVIATENCISAIGKILKFKPNCVNVD 
E VLPHWLS WL PLHEDKEEAIQTLS FLCDL I ES NHP WIG PNNSN 
LPKI I S I TJ^GKINETINYEDPCAKRLANVWQVO/TSEDLWLEC 
VSQLDDEQQEALQELLNFA 


6892


3 


876 


RSVAAASGPGAWGTDHYCI^UJRKRDYEGYLCSIiLLPAESRSSV 

p ft I iKA H TSt ur.l if+\} V |\JJ>> Vagi JvX .L l:r lii Y lrvt Y ll r j.r ri jVIv x vtui lUUNtrrn 

OPVAI ELWKAVKRHNLTKRWLMKIVDEREKNLDDKAYRNIKPXE 
NYAENTQSSLLYLTLE I LG I KDLHADHAASH I GKAQG I VTCLRA 

SQAHLHLJCHARSFHKTWVTG^PAFl^TVS 
IFHPSI^QKNTLLPLYDYIQSWRKTY 


6893 


1 


842 


DGERKSMS VERTFS E INKAEEQYSL CQEIjCSEIiAQDLiQ KE RLKG 
RTVTI KLKNVNFEVKTRASTVSSVVSTAEEI FAIAKBLLKTEID 
ADFPHPLRLRLMGVRISS FPNEEDRXHQQRS I IGFLQAGNQALS 
ATECTIjEICTDKDKFVlCPT^MSHKKSFFDKKRSERKWSHQDTFKC 
EAVN KQS FQTSQ PFQ VLKKXMNENLE I SENS DDCQILTCP VCFR 
AQGC 1 S LEAIiNKHVDECLDG PS I SEN FKMFSCSHVS ATKVNKKE 
NVPAS S LCEKQD YEAH 


6894 


1742 


1463 


TTIjCXPLVPREHQFYETLPAEMRKFTPQYKGKSQLIjEGLPHWRG 
DVRDRGHGP^WQPSLEPSLPPTIjCFPSLSSFSSSWPSAQICLTPS 
VFNPW 


6895 


2379 


478 


VTYVELCDLJ^PTALLIMRTVIiDLIVEDLQSTSEDKEQQYTSQT 
TRLLALL YALASHKACKLAILHL INGTI KGDERYAEI FQDLLAL 
up cwn cvTPnnrvPV\rr«! t t^>«? r /dodt ali lps ssegs t S el 
EQLSNSLPNKELMTS ICDCLLATLANSESSYNCLLTCVRTMM FL 
AEHDYGLFHLKSSLRKNSSALHSLLKRWSTFSKDTGELASSFL 
EFMRQILNSDTIGCCGDDNGLMEVEGAHTSRTMS INAAELKQLL 
QSKEES PENLFLELEKLVLEHSKDDDNLDSLLDSVVGLKQMLES 
SGDPLPLSDQDVEPVLSAPESLQNLFNNRTAYVLADVMDDQLKS 
MWFTPFQAEEIDTDLDLVKVDLIELSEKCCSDFDLHSELERS FL 
SEPSSPGRTKTTKGFKLGKHKK^TFITSSGKSEYIEPAKRAHVV 
PPPRGRGRGGFG<^IRPHDIFRQRKQNTSRPPSMHVDDFVAAES 
KEWPQDGI PPPKRPLKVSQKI SSRGGFSGNRGGRGAFHSQNRE 
FTPPASKGNYSRREGTRGSSWSAQNTPRGNYNESRGGQSNFNRG 
PLPPLRPLSSTGYRPSPRDRASRGRGGLGPSWASANSGSGGSRG 
KFVS GGS GRGRHVRS FTR 


6896 


1 


555 


GN I VIQKKKYN KQH 1 1 PLENVT I DS I KDEGDLRNGWLIKTPTKS 
FAVYAATATEKSEWMNHINKCvTDLI^KSGKTPSNEHAAVWVPD 
S EATVCMRCQKAKFT P VNRRHHCRKCG FWCG PCSEKRFLLPSQ 
S S KP VKI CD FCYDLLSAGDMATCQP ARSDS YSQSLKS PLNDMS D 
DDDDDDSSD 


6897 


3 


920 


GDGI^HEVVNGU^PDWETAIQKPLCSLPAGSGNALAASLNHY 
AGYEQ VTTNEDLLTNCTLIiLCPJlLLS PMNLLS 1^ FSVL 
SLAWGFIADVDIiESEKYRJUjGEMRFTIjGTFT*RIAAI^ 
YLPVGRVGSKTPASPVVVQCGPVDAHLVPLEEPVPSHWTVVPDE 
OFVLVIiALIaHSHLGSEMFAAPKGRCAAGVMHLFYVRAGVSR^
LRLFLAMEKGRHMEYECPYLVYV P VVAFRLE PKDGKGVFAVDGE 
IiMVSEAVQGQVHPNYFWMVSGCVEPPPSWKPQQMPPPEEPL 


6898 


919 


346 


QKTVTAVASLLKGRQG I YTENERRMGAVI KIRFFKIMLVLI ICW 
LSNI INESLLFYIjEMQTDINGGSLKPVRTAAKTTWFIMGILNPA 
O^Fl*l^LAFYGWTGCSLGFQSPRKEIQWESLTTSAAEGAHPSPL 
MPHENPASGKVSQVGGOTSDEALSMLSEGSDASTIBIHTASESC 
NKNEGD PALPTHGDL 


6899 


120 


827 


MKVRKNNDAYbliDKNKINMDCFISCPFKKMLTTLMFSHSGILSL 
LEHGEEYTFSLPCAYARS I LTVPWVELGGKVS VNCAKTG YSAS I 
TFHTKP FYGG KLHR VTAEVKHN I TNTWCR VQG EWNSVLE FTYS 
NGETKYVDLTKl^VTKKRVRPLEKQDPFESRRiWKNVTDSl,RES 
EIDKATEHKHTLEERQRTEERHRTETGTPWKTKYFIKEGDGWVY 
HKPLWKI I PTTQPAE 


6900 


3 


451 


TEVI^SKGIHELRSSTSAIiHHAr*EESASI*LTMP%fRAALPSTHI P 
VLPGKVGESTERELLEIjRTKVSQQEQIjLQSTTEHLKNANQQKES 
MEQFI VSQLTRTHDVLKKARTNLEVRKLLHQSEAPSLS PTHHHP 
LADLVGDSWPALRFQEK 


6901 


1 


201 


DDl^ VQRLETDFKMTIjOX^S TLEQ WAAWIiDNVMMQALKP YEGRP 
SFPKAARQFLLKWS FYRYHLGFS 


6902 


2 


267 


GAPPPPPSQPPRQPPQAAPSSHPHSDIjTFNPSSAIiEGQAGAQGA 
SDMPEPSLDLLPELTNPDELLSYLDPPDLPSNSNDDLLSLFENN 


6903 


1 


149 


RINQVYRQGPTG I HI LVIDQMVQNFQDESCFI*FSTVKAESS DGI 
HULK 


6904 


464 


2092 


MEASLPVSLSCVLACGDVEGKFDILFNRVQAIQKKSGNFDLLLC 
VGNFFGSTQDAEWEE YKTGI KXAPIQTYVLGANNQETVKYFQDA 
DGCELAENITYLGRKGIFTGSSGLQIVYLSGTESLNEPVPGYSF 

spkdvsslrmnu^cttsqfkgvdilltspwpkcvgnfgnssgevd 
tkkcgsalvssiatglkpryhfaalerctyyerlpyrnhi ilqen 
aqhatrfialanvgnpekkkylyafs i vpmklmdaaelvkqppd 
vtenpyrksgqeas igkqilapveesacqfffjjlnekqgrkrss 
tgrdskss phpkqprkppqppgpcwfclas pevekhl wni gth 
cylalakgglsddhvlilp ighyqs we lsaeweevekykatl 
rrffksrgkwcwfernykshhlqlqvi pvpiscsttddi kbaf 
i tqaqeqq i elle i pehsdi kq iaqpgaayf yveldtgeklfhr 
ikknfplqfgrevxj^eailnvpdksdvmqcqiskedeetlarr 

FRKDFE PYDFTLDD 


6905 


1 


226 


VSKTGEAETITSHYLFALGVYRTLYLFNWIWRYHFEGFFDLXAI 
VAGLVQTVL YCDFFYL YITKVLKGKKLS LPA 


6906 


3 


611 


S YDDHNGHI DFI TAASNLRAKM YS IEPADRFKTKRIAGKI IPAI 
ATTTATVSGL VALEM I KVTGG YP FEAYKNWFLNLAI P I WFTET 
TEVRKTKI RNGI S FT I WDRWTVHGKED FTLLDFINAVKEKYG I E 
PTMWQGVKMLYVPVMPGmKKLKLTMHKLVTa?TTEKKYVDLTV 

SFAPDIDGDEDLPGPPVRYYFSHDTD 


6907 


2 


2228 


LRGVPVWAAGAFRFSSGEESTSHLIMSRRSQRLTRYSQGDDDGS 
SSSGGSSVAGSQSTLFKDSPLRTLKRKSSNMKRLSPAPQLGPSS 
DAHTSYYSESLVHESWFPPRSSLEELHGDANWGEDLRVRRRRGT 
GGSESS RASGLVGRKATEDFLGS S SGYS S EDDYVGYSDVDQQS S 
S S RLRS AVS RAGS LLWMVATS PGRLFRLLYWWAGTTWYRI.TTAA 
SLLDVFVLTRRFSSLKTFLWFIjLPLIJLJjTCLTYGAWYFYPYGLQ 
T FHPALVS WWAAKDS RRADEGWEARDSS PHFQAEQRVMSRVHSL 
ERRLEALJU^FSSNWQKEAMRLERLELRQGAPGQGGGGGLSHBD 
TLALtiEGLVSRREAALKEDFRRETAARIQEELSALRAEHQQDSE 
DLFKKI VRAS QE S EAR I QQLKS EWQSMTQES FQES S VKELRRLE 
DQLAGIjCX)ELAAIiALKQSSVAEEVGLLPQQIQAVRDDVESQFPA 
WISQFLARGGGGRVGLIjQREEMQAQIiRELESKILTWAEMQGKS
AREAAASLSLTLQKEGVIGVTEEQVHHIVKQALQRYSEDR IGLA 
DYALESGGASVI STRCS ETYETKTALLSLFG I PLWYHSQSPRVI 
LQPDVHPGN CW AFQG P QG FAWRLSAR IRPTAVTLEHVP KALS P 
NSTISSAPKDFAI FtSFPEDlXXJEGTLIiGKi-TYDQDGEPIQTFHF 
QAPTMATYQVVELRILTNWGHPEYTCIYRFRVHGEPAH 


6908


3 


780 


QVPSAAW3bMAVCGIX3SRliGljGSRl^l^CFGAARLLYPRFQSRG 
PQGVETCDRPQPSSKTPRIPK1YTKTGDKGFSSTFTGERRPICDD 

QVFEAVGTTDELSSAI G FALELVTEKGHTFAE ELQK1 QCTLQDV 
GSALATPCSSAREAHLKYTTFKAGPI LELEQW IDKYTSQLPPLT 
AFILPSGGKISSALHFCRAVCRRAFJtRVVPI*VQMGETJ>ANVAKF 
LNRLSDYLFTLARYAAM KEGNQEKI YKKNDPS AES EGL 


6909 


3 


409 


t-ift 1 il lAvh 1 jjjj i vi^K J Jive EAJrtl U tVV^tmi xr vgiiguur *j*vr^* ^"^r 

S P Y LRGT I KMMQAVRQ AFQDQ DDRRTWDGRPL TMAATFDDCJjYA 

t nnmTTVT? C crrvrYTPWDTJT RTNTTFP'PPtT wSPAYLISElAMRRSRMS 
LYC 


6910 


1 


1068 


LVPWVI DS Y YYGKLV I APLNI VLYN 1 KTPHGPDLYGTEP WYFY 

L ING FLNFNVAFALALLVLPLTS lme yllqrfhvqnXiGHP ywlt 
LAPMYIWFI I FFIQPHKEERFLFPVYPLICLCGAVALSALQHSF 
LYFQKCYHFVTQRYRLElTYIVTSNVJLAIiGTVFLFGIjIiSFSRSVA 
LFRG YHG P LDLi Y PE FYR I ATDPT I HTWEGRP VNVCVGKE WYRF 

mt PVDQDVTnT cvPHYT.vni .fYTWRRTPREPKYSSNKEEWI SLAY 
RPFLDASP^SIO*IJlAFYVPFLSlXJYTVYVNYTIIJCPPJC^ 

KSGG 


6911 


1184 


966 


GEDAEEMETGNVANLIS I FGSSFSGLLRKSPGGGREEEEGEESG 
PEAAEPGQI CCDKPVLRDMNPWSTAIVAF 


6912 


1 


O e k*k 


iuviv t>vtttw<5FOMT .PTTTlSTGSALKAOS YEDAYRCI KSS ILLGSI 
SGGTD 1 1 SCFMGHNFS LPVY KGEI QARNLGMAVEAWNEEGKAVW 
GF^GELVCTKPIPCQPTHFWNDENGMKYRKAYFSKFPGIWAHGD 
YCRINPKTGGI VMLGRSDGTLNPNGVRFGSSE I YNIVES FEEVE 
DSLCVPQYNKYREEKVI LFLKMASGHAFQPDLVKRI RDAIRMGL 
SARHVPSLILETKGIPYTLNGKKVEVAVKQI IAGKAVEQGGAFS 
NPBTTiDLYRD I PELQGF 


6913 


1643 


1558


KKSHEESHKEELS YGAQAS LPLPCSDFR 


6914 


1251 


615 


ELAAECKS AG Y PGTLI P YRCDLSNEED ILSMFSAI RSQHSGVDI 
CINNAGLARPDTLLSGSTS GWKDMFNVNVLALS I CTRBAYQSMK 
ERNVDDGHI INI NSMSGHRVLPLS VTHFYS ATKYAVTALTEGLR 
QELREAQTHIRATCISPGWETQFAFKLHDKDPEKAAATYEQMK 
CLKPEDVAEAVIYVLSTPAHIQIGDIQMRPTEQVT 


6915 


254 


652


GRSLSFKTFLIWVLISIYQGGILMYGALVLFESEFVHVVAISFT
ALILTELLMVALTVRTWHWLMWAEFLSLGCYVSSLAFLNEYFD
VAFITTVTFLWKVSAITWSCLPLYVLKYLRRKLSPPSYCKLAS 


6916 


254 


652 


GRSLSFKTFLIWVLISIYQGGILMYGALVLFESEFVHVVAISFT
ALILTELLMVALTVRTWHWLMWAEFLSLGCYVSSLAFLNEYFD
VAFITTVTFLWKVSAITWSCLPLYVLKYLRRKLSPPSYCKLAS


6917 


254 


652 


GRSLSFKTFLIWVLISIYQGGILMYGALVLFESEFVHVVAISFT
ALILTELLMVALTVRTWHWLMWAEFLSLGCYVSSLAFLNEYFD
VAFITTVTFLWKVSAITWSCLPLYVLKYLRRKLSPPSYCKLAS


6918 


28 


921 


PEAGTRS WREPD P EDLRRFLLS AACRS FPQWLPGGGGGQVSS CS 
DTDVPYI^LAVKSEPGRFAERQAVRETWGSPAPGIRLLFLLGSP 
VGEAGPDLDSLVAWESRRYSDLLLWDFLDVPFNQTLKDLLLLAW 
LGRHCPTVSFVLRAQDDAFVHTPALLAHLRAL P PAS ARS LYLGE 
VFTQAMPLRKPGGPFYVPES FFEGGYPAYASGGGY VIAGRLAPW 
LLRAAAR VAP FP FEDVYTGLC I RALGLVPOAHPG FLTAW PADRT 
ADHCAFRNLL L VR P LG PQ AS IRLWKQLQDPRLQC


6919 


850 


41 


QGRRELSGS VFCP F I QQEPKEJtLTIjSEYHER VRSQGQQXiQQIjQA 
ELDKLHKEVSTVRAANS ERVAKLVFQRLNEDFVRKPDYALS SVG 
AS IDLOKTSHDYADRNTAYFWNRFS FWNYARPPTVILEPHVFPG 
NCWAFEGDQGQWT QI*PGRVQLSDI TLQHP P P SVEHTGG ANSAP 
RD FAVpFLLS FFTHQGI*QVYDETEVSLGKFTFDVEKSE I QTFHL 
QNDP PAAFPKVKI Q I LSNWGH PRFTCLYRVRAHGVRTS EGAEGS 
AQGPH 


6920 


1418 


591 


EAQGPSKVHLTIiKKKK 


6921 


2 


1711 


MNATRS EEOFHV INHAEOTIjRKMENYLKEKOL<ZDV1iL TAGHLR I 
P AHRLVLSAVSDYFAAMFTNDVLEAKQEEVRMEG VDPNAIJTS LV 
Q YAYTGVIiQLKEDTI ES LliAAACLLQIiTQVI DVCSNFL I KQLHP 
SNCLG IRS PGDAQGCTBljLNVPilUOrrMEHF I E V I KNQE FKLLPA 
NEIS KIiLCSDD INVPDEETI FHALMQWVGHDVQNRC^jELGMIjIjS 
YIRLPLLPPQIJjADIiETSSMFTGDLECX>KIiIJlEAMKYHl»l»PERR 
SMMOS PRTKPRKSTVGAIjYAVGGMDAMKGTTT I EKYDL.RTNSWL* 

HIGTMNGRRLQFGVAVIDNKLYVVGGRIX3LKT1OTVECFNPVGK 
IWTVMPPMSTHRHGIiGVATIiEGPMYAVGGHIX^WSYI.NTVERWDP 
EGRQWlTYVASMSTPRSTVGVVAIiNNXI»YAJGGR 
FDPHTNKWSLCAPMSKRRGGVGVATYNG FLYWGGHDAPASNHC 
SRl^DCVERYDPKGDSWSTVAPI^VPRDAVAVCPLGDKLYVVGG 
YDGHT YLNTVES YDAQRNEWKEEVP VNIGRAGACVVVVKL P 


6922 


1075


369 


I/F P P AGIRHEVRDRERERERE H K W E K FPL.DSTGS EUCON I HS TT 

gl p p amqkvm ykgiiapedictlp^i kvtsgaki mgggst indvla 
votpknaac^dakaeenkkeplcrokqhpjcvldkgkpedw 
kgaqerlp1vplsgmynksggkvrltfkleqdqlwigtkertek 
lpmgsiknwsepieghf:dyhmmafqi>gpteasyywvywvptqy 
vdai kdtvlgkwqyf 


6923 


2469 


1660 


LGLFCIIJPIDTLCAVLiERJDTLS IRESRIiFGAWRWAEAECQRQQ 
I*PVT PGNKQKVJjGKALS IjI RF PliMT I E EFAAG PAQSGI I»SDREV 
VWLFIiHFTVNPKPRVEYIDRPRCCLRGKECCINRFGQVESRWGY 
SGTSDRIRFTVNRRISIVGFGLYGS IHGPTDYQVNIQI IEYEKK 
QTLGQNDTGFS CDGTANTFRVM FKE P IE I LPNVC YTACATLKG P 
DSHYGTKX3LKKVVHETPAASKTVFFFFSSPGNNNGTSIEDGQIP 
EIIFYT 


6924 


2210 


1235 


PEERVI CFVEYYIiTAFHEGRKGAIiAKKP YNPI IGETFHCSWEVP 
KDRVKPKRTASRSPASCHEHPMADDPSKSYICLRFVAEQVSHHPP 
ISCFYCECEEKRLCVNTHVWTKSKFMGMSVGVSMIGEGVI»RI*LE 
HGEEYVFTLPSAYARS ILTIPWVELGGKVSINCAKTGYSATVI F 
fTTKPFYGGKVHRVTAEVKHNPTITriVCKAH 

BTKVI DTTTI^VYPKKIRPLEKQGPMESRNLWREVTRYIjRLGDI 
DAATEQKRHLEEKQRVEERKRENLRTPWKPKYFIQEGDGSG I LQ 
SPLESTLMGIiEVQSFPV 


6925 


2 


1653 


RGGAAGAAMBPDSVIEDKT I ELMCSVPRSLWJLGCANLVESMCAL 
SCIiQSMPSVRCLQISNGTSSVIVSRKRPSEGNYQKEKDLCIKYF 
DQWSESDQVEFVEHLISRMCH YQHGH INSYUCPMLQRDFI TALP 
EQGLDH I AENI LS YLDARS LCAAELVCKEWQRVI S EGMLWKKL I 
ERMVRTDPLWKGLS ERRGWDQYIiFKNRPTDGP PNS FYRS LYPKI 
rQDIETIESNWRCGRHNLQRIQCRSENSKGVYCLOYDDEKIISG 
LRDNS I KIWDKTS LECLKVIjTGHTGSVIJCLQYDERVIVTGSSDS 
TVRWDVNTGEVLNTL IHHNEAV1»H LR FSNGLMVTCS KDRS IAV 
WDMASATDITLRRVX.VGHRAAVNWXJFDDKYIVSASGDRTI KVW 
STSTCEFVRTLNGHKRGIACLQYRDRiVVSGSSDNTIRLWDIEC 
GACLRVLEGHEELVRCIRFDNKRIVSGAYDGKI KVWDLQAALD P 
RAPASTLCLRTLVEHSGRVFRIjQFDEFQI ISSSHDDTILIWDFIj 
NVPPSAQNETRS PSRTYTYI SR 


6926


1


733 


SGRVAMDGLGLQFPEQGFPAGPPLTiP PHMGGHYRCCQSLGAPPL 
TCYPLPTPDTSPI*DGVDPDPAFFAAPMPGDCPAAGTYSYAQVSD 
YAGPPEPPAGPMHPRLGPEPAGPS I PGIiLAPPSALHVYYGAMGS 
PGAGGGRGFQMQPQHQHQHQHQHHPPGPGQPTPPPEALPCRDGT 
DPSQPAEKLGEVDRTE FEQYLHFVCKPEMGLP YQGHDSGVNLPD 


6927 


2 


1484 


LTLCGDIQI^l^QNAiWRAAHI^EFHYQTKEDQSILHSLHRESS 
CQGFA WATDI#S TDItESQLS VS CKC YEAANBI LQFRDIiKS QN P EH 

KKS FSCFEKGI HNFES I E DATNAALLLCNTGRLMRI CAQAHCGA 
GDE1»KRBFSPEEGLYYNKAIDYYIjKAIjRSIjGTRDIHPAVWDSVN 
WELSTTYFTMATI^DYAPISRKAQEQIEKEVSEAMMKSIiKYCI) 
VDSVSARQPLCQYP^^TIHHPvLASMYHSCLRNQVGDEHLRKQHR 
VIADLHYS KAAXLFQLLKDAPCKLI^RVQLERVAFAEFQMTSQNS 
tAraKLKTl^GAI*DIMVRTEHAFQLIQKEXiIEEFGQPKSGDAAAA 
ADAS PSLNREEVMKLLS I FESRLS FLLLQ5 I KLT STKKKTSNN 
IEDDTlLKTNKHIYSQLLRATANKTATLLFJRINVI\mLLGQLAA 
GSAAi SNA VU 


6928 


1086 


777 


EAJDIj INNLLQ VKMRKRYS VDKTLSHPWLQDYQTWLDDRELECK 
IGERYITHESDDLRWEKYAGEQGLQYPTHLINPSASHSDTPETE 
ETEMKALGBRVS IL 


6929 


1749 


607 


RDQRG YRDDRS PAREPGD VS ARTRSGGGGGRSATTAMP P P VPNG 
NLHQHD PQDLRHNGNVWAGRPS CSRG PRRAI Q KPQ P AGGRRSG 
RGPAAGGLCIXJPPIX^TCVPEEPPVPPMDWEALEKHLAGLQFRE 
QEVRNQGQARTNSTSAQKNERESIRQKXJU/3SFFDDGPGIYTSC 
SKSGKPSIiSSRLQSGMNLQICFVNDSGSDKDSDADDSKTETSlJ) 

1 C»C<C»oc.Jai-il JUnUe J-i J icyJU\i^ftBiiRn 
AIAMAKPMAKMQ VEVEKQNRKKS P VADIiLPHMPH I SECLMKRSL 
KPTDLRDMTIGQLQVIVNDI^QlESIjNEEIiVQLIiLIRDEIiHTE 
QDAMIiVDIEDLTRHAESQQKHMAEKMPAK 


6930 


131 


545 


FMTTAlIVFVSIiFQMRNNFPJiYFIEPSQLKLFYDVITWIVTQVAI 
SYTWPFVLI^IKPSLTFYSSWYYCI»HILGII*VI,LIJ,PVKXTQR 
RKNTHENIQI^QSKKFDEGENSI^QNSFSTTNNVCNQNQEIASR 
HSSLKQ 


6931 


2 


659 


FVERLPNRPACLLVASGAAEGVSAQS FLHCFTMASTAFNLQVAT 
PGGKAMT?PVT>VTF WARWVODFRI^KAYASPAKLES TDGAR YHAli 
LIPSCPGALTDIASSGSLARI LQHFHSES KPICAVGHGVAALCC 
ATNEDRS WVFDS YS LTGPSVCFXVRAPGFARLPLVVEDFVKDSG 
ACFSRS E PDAVHVVLDRHLVTGQtf AS S TVPAVQNI*L FLCGS R K 


6932 


2 


1131 


FVDS PG QGEQAEE EEGG I QMN S RMRAH S PAEGAS VES SS PGP KK 
S DMCEGCRSLAAGHPG YISHDKETS I KYVSHOHPSHPOLFS I VR 
QACVRSLSCEVCPGREGPIFFGDEQHGFVFSHTFFI KDStiARGF 
QRWYS I ITIMMDRI YIjINS WPFIiIjGKVRGI IDELQGKALKVFEA 
EQFGCPQRAQRMNTAFTPFLHQRNGNAARS LTSLTSDDNLWACL 
HTSFAWiajKACGSRIiTEKLIiEGAPTEDTLVQMEKLADLEEESES 
WDNSEAEEEEKAPVLPESTEGRELTQGPAESSSLSGCGSWQPRK 
LPVFKSLRHMRQVGGRGTAHHELRRRANHGLCLPTRLASGPSTIi 
KTLQEVTDSLLGGWLMAQGVGGI I 


6933 


1431 


890 


SLNLHCTLPPPPHQYPAGYPSDKEGKKPKGQS KKQPSGTTKRPI 
SDDDCPS AS KVYKAS DSAEA1 EAFQLTPQQQHIjIREDCQNQKLW 
DEV1>SHI»VEGPNFIjKKLEQSFMCVCCQELVYQPVTTECFTWVCK 
DCI^RSFKAQVFSCPACPJ|DIiGQNYIMIPNEILQTLLDI.FFPGY 
SKGR 


6934 


3030 


2588 


DRDHSQCGGIRRVALARVSSVKLI SKAKIRTVKMTFI IVIAFI V 
CWTP FFFVQMWS VWDANAPKEASAFI I VMLLASLNSCCNPWIYM 
LFTGHL FHEliVQRFIiCCSAS YLKGRRI/3ETS ASKKSNSS SFVliS 
HRSSSQRSCSQPSTA 


6935 


886 


543 


NSALYVAGGNDGTSCLNSVERYS PKAGAWES VAPMN IRRSTHDL 
VAMDGWLYAVGGNDGSSSLNS ISKYNPRTNKWVAASCMFTRRSS 
VGVAVIiKTiltNFPPPSSPTLSVSSTSL 


6936 


1347 


567 


RSHRRQF1*SRALI»EFFGICSHPPPHRLFRKSIjNVGLHYSHIPFLT 
TCLHFLRKRLQKGEVGLSVET S KPQVP VGG LS RKKVPQ E PWATV 
MEKPJJQEAQLYKEEGNQRYREGKYRDAVSRYHRALIiQLRGLDPS 
LPSPLPNLGPC^PALTPEQENILHTTQTDCYNmAACIjbQM 
NYER VREY S Q KVLE RQ P DNAKAL YRAGVAF FHLQ D YDQARH YLL 
AAVNRQPKDANVRRYLQLTQSELSSYHRKEKQLYIjGMFG 


6937 


1 


727 


AVEFRCCPGRDPAC FARG WRLDR VYGTCFCDQACRFTGDCC FD Y 
DRACTARPCFVGEWSPWSGCADQCKPTTRVRRRSVQQEPQNGGA 
PCPPLEERAGCl^EYSTPQGQDCGHTYVPAFITTSAFNKERTRQA 
TS PHWSTHTEDAGYCMEFKTESLTPHCALENRPLTRWMQ YLREG 
YWCVDC^PPAMNSVSI^CSGIX3I^IX5NOTI*HWQAJG^ 
TWKKVRRVDQCSCPAVHS F I F I 


6938


3 


719 


NSRXL.ELAERVDTDFMQLKKRRQSSEKEN DSGTLDTVGAVVVDH 
EGNVAAAVSSGGLALKH PGRVGOAALYGCGCWAENTGAHNPYST 
AVSTSGCGEHLVRTI LARECSHALQAEDAHQAT.T .FTMQNKFISS 
P FLASEDGVLGGVI VI*RSCRCSAEPDSSQ13KQTIXVEFLWSHTT 
BSMCVGYMSAQDGKAKTHISRIiPPGAVAGQSVAIEGGVCRLGEP 
SELTLQAECEASQRHFRT 


6939 


3 


810 


KVTAPRRPQR YSSGHG SDNSS VLSGELP PAMGRTALFHHSGGS S 
GYESIiRRDSEATGSAS SAPDSMSESGAAS PGARTRSLKSPKKRA 
TGLQRRRLI PAPLPDTTALGRKPSLPGQWVDLPPPLAGSLKEPF 
E I KVYEI DDVERIjQRPRPTPREAPTQGLACVSTRLRIiAERRQQR 
LRE VQAKH KHLCEELAETQGRLMLE PGRWLEQFEVDP ELEPES A 
E Y1AALERATAAI£QCVNLCKAHVMMVTC FD I S VAAS AAI PGPQ 
EVDV 


6940


1188 


496 


GKMAAQPLRHRS RCAT P PRGDFCGGTERA I DQAS FTTSME WDTQ 
WKG S S P LGPAGUGAEE PAAGPQLPSWLQPERCAVFQCAQ CHAV 
IiADSVHLAWDLSRSLGAWFSRVTNNWLEAPFLVGIEGSLKGS 
! TYNLLFCGSCGIPVGFHLYSTHAALAALRGHFCLSSDKIWCYLL 
KTKAI VNASEMDIQNVPLS EKIAELXEKIVLTHNRLKS LMKI LS 
EVTPDQSKPEN 


6941 


1 


713 


SLSRADSDPHGPHTCGHVLNVI IGSNVLAIiAEAQRQAEALGYQA 
VVLSAAKG^DVKSMAQFYGLIiAHV7ARTRLTPSMAGASVEE12AQL 

helaaelq i pdlqleealbtmawgrgp vcllaggeptvqlgx3sg 
rggrnqelalrvgaelrrwplgpidvlflsggtdgqdgpteaag 
avfvtpelasqaaaegldiatftiahndshtffcxrlqggahlijhtg 
mtgtnvmdthllflr pr 


6942 


1 


246 


GDYVERYDPKTDTWTMGAPLSMPTNAVGGCLLGDRltYADGGYDG 
QTYIiNTMESYDPC/I^EWTQMASIiNIGRAGACVVVI KQP 


6943 


1 


739 


PMATG DGAKT LAI HVKALTADS I R I T W KATLPAi* t> t KL,i> w l>KJJj 
HS PAGGS ITETLVQGDKTEYIjLTALEPKPTYI I CMVTMETTNAY 
VADETP VCAKAETADS YG PTTTLNQEQNAGPMAS L PLAG I IGGA 
VAL VFL FLVLGAI CWYVHQAGELLTRERAYNRG S RKKDDYMESG 
TKKDNS ILE IRGPGLQMLP I MP YRAKEEYWHTI PPSNGSSLCK 
ATHTIGYGTTRGYRDGGIPDIDYSYT 


6944 


960


156 


VANILLNGVKYESBLTGS SERAEQPLSVGRLCSTI CNMPKALRT 
LCViraFLGWLSFEGMLLFYTDFMGEWFQGDPKAPHTSEAYOKY 
NSGVTMGCWGMCI YAFSAAFYS A I LEKLEEFLS VRTLYFI AYLA 
FX3LGTG1ATLSRNLYWLSLCITYGILFSTLCTLPYSLLCDYYQ 
SKKFAGSSADGTRRGMGVDI SLLSCQYFLAQ ILVSLVLGPLTSA 
VGSANGVMYFSSLVSFLGCLYSSLFVIYEIPPSDAADEEHRPLL 

LNV 


6945 


2067 


179 


EGEDRGLPRTMGAALGTGTRI^WPGRACGAIjPRWTPTAPAQGC 
HSKPGPARPVP LKKRGTDVTRNPHIjNKGMAFTIjEBRIjOLG I HGL 

RVLTSDVEKFMPIVYTPTVGLACQHYGLTFRRPRGLFI tihdkg 
HLATMIjNSW p edn ikawvtdgeri LGLGDLG C YGMG I P VG kla 
LYTACGGVNPOX2CLPVLLDVGTNNEELLRDPLYIGLKHQRVHGK 
AYDDLI^EFMQAVTDKFGINCIjIQFEDFANANAPRIiLNKYRNKY 
CM FNDD I QGTAS VAVAG I LAALR I T KN KL>SNHVFG FQGAGEAAM 
G\ IAHLLVMALE \KEGVPKA\EATRXI W\MVDF XKG&IVQGRDH 
LNHE KEMFAQD \ HPEVNS LEEWRLVKPTAI IGVAAIAEA\FTE 
QILRDMASFHERP\l I FALSNPTS KAECTA\ E KCYRVTEG PRGF 
FAS\GSPF*GVLIWEMGKTFIPGGRGNNA*RVPRGWQLGVHSPG 
GDPGH I P\ DE I FLPDSRAKLPQEVS EQHI*SQGRli Y P \ PLST \ IR 
NVFLRIAI KVFD * GYKHNI»V\ S YYP EP KD\ KEAFCKI PG S YTPD 
YDSFYT/VDS YI WAQGKAMNVQTV 


6946 


133 


2551 


SCEYSGI TVAPGDPCPGVAHLIiAPSWASDTPESLWAIjCTDFCIiR 
NLDGTIJGYIjIiDKETLRIjH PD I FliPSE I \CDRLVNE YVEJbVNAAC 
NF\EPHE\SFFNPLFRDPRKQPASRRIHL\RED\LVQD\QD\LE 
AIRKQDL\VEL\i^TN\CEKLSAKSLOTI^FSHTi^GVP*AFPG 
CSTNIIjLJjRKENPGGL/CEDEYIjFNPTCQVIjVXDFTFEGFSRLiR 

fNlkix^rmidwpvesXi^rpi^slaaldlsgiqtsdaaXfltq 

WKDSIj\VSLVL\YNMDIjSDDHIR\VI VQLHKI»RHTtniSRDRI>SS 
YYKFKLTREVliSLPVQKLGNLMSIiDI SG\HMILENCS IS KIGKR 
EAGQTS I \EPS K\ S S 1 1 P FRGFEGG PLQF \IX3 VF * G I FCGRLTH 
I PAYKVSGDKNEEQVIiNAI EAYTEHRPE ITSRAINLLFDIAR I E 
RCNQI^RALKLVITALKCHKYDRNIQVTGSAALFYIjTNSEYRSE 
QSVKLRRQVI QVVLNGMESYQEVTVQRNCCLTLC2JFS I PEELEF 
QYRRVNELLLS I LNPTRQDES IQRIAVHLCNALVCQVDNDHKEA 
VGKMGFVV1WLK1#TQKKLI*DKTCDQVMEFSW\ 
NCEMFLNFNGMKLFLDCIjNEFPEKQEL»HRNMI^ 
RPQLMTSQFI SVFSNLLESKADGIEVS YNACGVI>SHIMFDGPEA 
WG VCE PQREEVEERMWAAI QSWD INS RRNINYRS FEPILRLLPQ 
G I SP VSQHWATWALYNIjVS VYPDKYCPLb I KEGGM PLIiRDI I KM 
ATARQETKEMARKVIEHCSNFKEENMDTSR 


6947


o 




S QS FPAPRSQQRVASGGRS KVPLKQGRS LMDW I RLTKSGKDLTG 
LKGRL I EVTEEELKKHNKKDDCWIC IRGFVYNVS P YMEYHPGGE 
DEXMRAAGSDGTELFDQVHRWVNYESI^KECDVGRMAI KPAVLK 
DYPJSEEKXVLNGMLPKSQVTDTIjAKEG PS YPS YDWFQTDSIiVTI 
/EHI Y * TEGYQFRLNNS *SSE* FI.YSRNNY*GLIiIS YTYW/R* A 
MRFRKI FIXXSL/CESVGKIEIVJ^KXEirrSWDFLGHPLKNHNSL 
I PRKDTGLYYRKCOL ISKEI?VTBiyrRliFCLMI^PS THLQ VP IGQ 
HVYT.KLP ITGTE I VKP YTP VS GSI*LSE FKEP VLPHNKYI Y FL IK 
IYPTGLFTPELDRLQIGDFVSVSSPEGNFKTSKFQELEDLFLLA 
AGTGFTPMVKIIiNYALTDI PSIiRKVKLMFFNKTEDDIIWRSQLE 
KLAFKDKRLDVBFVLiSAP I S BWNGKQGH I S PAIiIaS E FLiKRNLDK 
SKVLVCI CGPVPFTEQGVRLLHDLNFSKNEIHSFTA 


6948 


104 


58 


PDGAHSFFPDEYFTCSSLCI^CGVGOCKSMNHGKEGVPHEAKSR 
CR YSHQYDNRVYTCKACYERGEEysVVPlCTSASTDS PWMGLAKY 
AWSGYVI ECPNCGVVYRSRQYWFGNQDPVDTWRTE I VHVWPGT 
IX3FLKDNNNAAQRLIJ3GMNFMAQS VS ELSIX5PTKAVTS WLTDQ I 
APAYTOPNSQII^CNKCATSFKDNDTKHHCRACGEGFCDSCSSK 
TRPVPERGWGPAPVRVCDNCYEAR/TRPVS CYRGTSGR * RRRRT 
QETVE 


6949 


152 


4656 


GLRLCLSRPLiTRPGDDSVGGSAMASGAGGVGGGGGGKI RTRRCH 
QGPIKPYQC<3R0^HC^ILSRVTESVKNIVPGWLQRYFNKNEDVC 
SCSTDTSBVPRWPENKEDHIiVYADEESSNITDGRITPEPAVSNT 
BEPSTTSTASTXYPDVLTRVSLYRSHI^FSMLESPALHCQPSTS 
SAFPIGSSGFSLVKEIKDSTSQHDDDNISTTSGFSSRASDKDIT 
VSKNTSLPPLWSPEAERSHSI>SQHTATSSKKPAFNI^AFGTLSP 
SIX^SSILKTSQIiGDSPFYPGKTTYGGAAAAVRQSKIiRNTPYQA 
PVRR QMKAKQLS AQS YG VTS S TARR I LQSLEKMSS PLADAKR I P 
S IVSSPIiNSPLDRSGID I TDFQAKREKVDSQY PPVQRLMTPKPV 
S IATNRSVYFKPSbTPSGEFRKTNQRIDKKCSTGYEKNMTPGQN 
REQRESGFSYPNFSLPAANGLSSGVGGGGGKMRRERHAFVASKP 
LEEEEMEGPVLPKISLP ITSSSLPTFNFSSPE ITTSS PS PINSS 
QALTNKVQMTSPSSTGSPMFKFSSPI VKSTEANVLPPSS IGFTF 
SVPVAKTAELSGSSSTLEPI ISSSAHHVTTVNSTNCKKTPPEDC 
EGP FRPAE I LKEGS VLD I LKS PGFAS PKIDSVAAQPTATSP WY 
TRPA I SSFSSS GIGFGESLKAGSS WQCDTCLIiQNKVTDNKCI AC 
QAAKLS PRIXTAXQTGIETPNKSGKTTI*SASGTGFGDKFKPVIGT 
WDCDTCLVQNKPEAI KCVACETPKPGTCVKRAIjTLTVVs ESAET 
MTAS S SSCTVTTGTLGFGDKFKR PIGS WBCS VC CVSNNAEDNKC 
VSCMSEKPGSSVPTSSSSTVPVSLPSGGSLGLEKFKKPEGIWDC 
ELCLVQNKADSTKCLACESAKPGTKSGFKGFDTSSSSSNSAASS 
SFKFGVSSSSSGPSCyTLTSTGNFKFGDQGGFKIGVSSDSGYINP 
MSEGF* FSKHI VGFKFGVSSESKPEEVKKDSKNDNFKFGI^S FGL 
SNPVFLTPFQFGVSNIiGQEEKKEEIiliKSSCAGFRFGTGVINSTR 
VPANTIVTSENKSS FNLGTI ETKSVS VAPLKCQTS EAKKEEMPA 
TKGG FS FGNVE P AS LPS AS VFVXiGRTEE KQQE P VTS TS L V FGEG 
IOUTMKEPKC\QPVFSFGEFQRQTKDENSSKSTFSFSMTKPSEKE 
SEQPAKATFAFGAQTNTTADQGAAKPDLS YLNNSSSS SSTPATS 
AGGG\ IFGSSTSSSNPPVATFVFGQSSNPGSSS \AFGNTAESST 
SQSLLFSQDSK3ATTSSTGTAVTPFVFGPGASSNNTTTSGFGFG 
ATTTSSSAGSSFVFGTGPSAPSASPAFGAKQTPTFGQSQGASQP 
NPPGFGS I SSSTALFPTGSQPAPPTFGTVSSSSQPPVFGQQPSQ 
SAFGSGTTPNSSSAFQFGSSTTNFNFTNNS PSGVFTFGANSSTP 
AASAQPSGSGGFPFNQSPAAFTVGSNGKNVFSSSGTSFSGRKIK 
TAVRRRK 


6950 


2585 


411 


PRPGSRSGLCRRAGERGAVRAGGLSRRTRAE * I MDE LHYQDTDS 
DVPEQRDSKCKVKWTHEEDEQLRALVRQ FGQQDWKFLASHFPNR 
TDQQCQ YR WLRVLNP DL VKGP WTKEEDQKVT ELVKKYGTKQWTL 
I AKHLKGRLGKQCRERWHNHLNPEVKKSCWTEEEDRI I CEAHKV 
LGNRWAEIAKMLPGRTDNAVKNHWNSTIKRKVOTGGFLSESKDC 
KP P VYIiKL E LED KDG LQS AQPTE GQGS LLTNW PSVP PT I KEEEN 
SEEELAAATTSKEQEPIGTDLDAVRTPEPLEEFPKREDQEGSPP 
ETSLP YKWWEAANLLI PAVGSSLSEALDL IESDPDAWCDLSKF 
DL PEEPSAE DS INNS LVQLQASHQQQVLPPRQ PSA\ LVPSVTE Y 
RLDGHTISDLSRSSRGELIPISPSTEVGGSGIGTPPSVLKRQRK 
RRVAI*SPVrENSTSLSFLDSCNSLTPKSTPVKTLPFSPSQFLNF 
WNKQDTLELES PSLTSTPVCSQKVVVTTPLHRDKTPLHQKHAAF 
VTPDQKYSMDNTPHTPTPFKNALEKYGPLKPLPC/TPHLEEDLKE 
VLRSEAGIELI IEDD IRPEKQKRKPGLRRSP I KKVRKSLALDI V 
DEDMKIJ^MSTLPKSLSLPTTAPSNSSSLTLSGIKEDNSLLNQGF 
LQAKPEKAAYAQKPRSHFTTPAPMSSAWKTVACGGTRDQLFMQE 
KARQLLGRLKPSHTSRTLILS 


6951 


1940 


239 


AGPDDTMKRSIjQALYCQIjIiSFIjLIIJUjTEALAFAIQEPSPRESL 
QVLPSGTPPGTMVTAPHSSTRHTSVVMLTPNPDGPPSQAAAPMA 
TPTPRAEGHPPT\TPSPPSLRQ* PPP IIJCAP/SSTGPAPAAMAT 
TSSKPEGRPRGOJU\PTILLTKPPGATSRPTTAPPRTTTRRPPRP 
PGSSRKGAGWSSRPVPPAPGGHSRSKEGQRGRNPSSTPLGQKRP 
LGKX FQI YKGNFTGSVE PE PSTLTPRTPLWGYS SS PQPQTVAAT 
TVPSNTS WAPTTTS l^PAKDKPGIiRRAAQGGGSTFTSQGGTPDA 
TAASGAPVS P/PSCPS AFSAP PPR* PTGWPQP* * LLAYCYP\CT 
SRPIiSTS SGVFTAATG PTPAAFDTSVSAPSQGI PQGASTTPQAP 
THPSRVSBSTISGAKBETVA\ PSP * PTGCPVLSPQWYPQPQAI S 
STAWS PPGPGSLGQQGTSPMWPRGTNRSTEPPSA* ARW1S PG* S 
WPSACPSPP\lV2PATCVl^BEEEEDRQPGBQPEAYGKEfTHHPGT 
TFQQAC \ RGAAPG E I PVPLKPLRTQLSEPRSPANGDYRDTGMVP 

c 


6952 


658 


304 


PESEGESGEMTDRYTIHSQLEHLOSKYIGT\ATPTPPSGSG\CE 
PTPRLV1^LHGPLRPSQLLRHCX3E*EQSASPU^L1^KDASA1,WT 
ASRQAJRGELRIjCIiTTAVRGTS ps vs p vcqss 


6953 


1512 


349 


nwgktrajlaasgkhvpfgkqtnpnks/ vhcds *g**rrettqdbs 
fs phfrgkmggw\ kx.ekelenteqpvggneg *ehevtgni*nsd 
pij^lcqcplcqldcgsreqliahvyqhtaawsaksymXcpvc 

UKAXtoS PGSIasKHIjJj I HSEDQRSN CAVCGARFTSHATFNSEKLP 
BVIji^ESLPTVHI^GPSSAEGKDfAFSPPVYPAGIl^VCNNCAA 
YRKLIiEAQT PS VRKWALRRQNEPLEVRLQRLERERTAKKSRRDN 
ETPEEREVRRMRDREAKRl^RMQETDEQRARRLQRDREAMRIiKR 
AIETPEKRQARilRFJlFJUa^LKRRLEKMDtMMLRAQFGQDPSAMA 
ALAAEMNFFQIiPVSGVEL.DSQLIjGKMAFEEQNS sslh 


6954 


819 


1 


PP PP F 1 1 PSHP REACT* AG * KRSGDS ECS P P VEQ * A* TRAAAQN 

* PQR * RWTEGNS PQASAVATPGQGAS PAAPRCTP* PSRRHRRLP 
PGARP PAG * AAPAPT KP WLAG PAS APQPGAAPLS P PAP PLIRTR 

* CAG AAARGR PRRDRS PRPRTPGGCSWSEPRTPPAVSASAQTPS 
UAG w AeGR*GQKQRPSTGR * P PGVGGAGRSHRREGTI PGNPHPR 
AS * RAGWQR* PGP / REWGL * E PQGEEMSGPGGPGGAP PNQVGSS 
VMQAMSTGI 


6955


1968 


782 


PPGRRQVRAQVAGAPVGHWGTRARQVlcrGGRRRARRTMPFliGQD 
WRS PGW S W I KTEDG WKRCESCSQKLERENNHCN I SHS I ILNSED 
GEIFNMEEHEYASKKRKKDHFRNDTNTQSF^ 

ERHGYCTI^EAFNRLDFSSAIQDIPJRFKYVVKXJiQLIAKSQLTS 

JL^GVAQKNYFNIIJDKXVQICVLDDHHNPRLIKDIXQDLSSTI,CIL 

/N* RSREVCISGKHQYLDLP I RNYSRLATTATGSSDD* ASE \NG 

LTLSDLPLHMUWILYBPSDGVTOIITl^QVTPTLY 

KKX^YHFAEKQFCRHLILSEKGHIE^KLMYFALQKHYPAKEQY 

GDTMFCRHCSILFWKDSGHPCTAADPDSCFTPVSPQHFIDLFK 

F 


6956 


8605 


3839 


QTSTS I FASPTS PPVI^ESVIjQDNSFDLNNGSDAEQEEMETQSS 

dfppsltqpapdqsstiqlhpats pavspttspavslws paas 
peispevcpaastws pavfs wspassavlpavsusvpltasv 

TSPKASPVTSPAAAFPTASPANK1TVSSFLETTADVEEITGEGLT 
ASGSGDVMRRRIATPEEVRLPLQHGWRREVRI KKGSHRWQGETW 
YYGPCGKRMKQFPEVIKYIiSRNVVHSVRREHFSFSPRMPVGDFF 
EERDTPEGW2WQI^ASEIPSRIQAITGKRGRPRNTEKARTKEV 
P KVKRGRGRP PKVKI TELLNKTDNR PLKKLEAQETLNEEDXAKI 
AKSKKKMRQKVQRGECXyraiO^G^UWKRKQETKSLKQKEAKKKS 
KAEKETCGKTKQEKLKEKVKREKXEKVK^ 

KTIJ\l^RRLEERQRQQMII*EE^!KKPTEDMCI*TDHQPIiPDFSRVP 
GLTLPSGAFSDCLTIV^FI^SFGKVl^FDPAKD^SI^VljQEGL 
LCQGDSI^EVQDIxLVRIjIjKAAIjHDPGFPSYCQSIjKILGEKVSEI 
PLTRDNVS E ILRCF^iMAYGVEPALCDRLRTQPFQAQPPC^KAA V 
LAFLVHELNGSTLI INEIDKTLESMSS yrknkwi vegrlrrlkt 
VloAmTGRSEVEMEGPEECl^RRRSSRINIF^rrSGMEEEEEEES I 
AAVPGRRGRRDGEVDATASS I PELERQIEKLS KRQLFFRKKLLH 
SSQMLRA VS LGQDR YRRR YWVLP YLAG I FVEGTEGNIi VP E EV I K 
KETDS LKVAAHASLNPALFSMKMEl^VGSNTTAS S PARARGR PRK 
TKPGSMQPRHTiKSPVRGQDSEQPQAQMPEAQLHAPAQPQPQLQ 
LQLQSHKGFIiEQEGS PI>SIjGQS QHDI*SQSAFLSWLSQTQSHSSL 
LSSSVLTPDSS PGKLDPAPSQPPEEPEPDEAESSPDPQALWFNI 
SAQMPCNAAPTPPPAVSEDQPTPSPQQLASSKPMNRPSAANPCS 
PVQFSSTPIiAGLAPKRRAGDPGEMPQS PTGLGQP KRRGRP P S KF 
FKQMEQRYI/TQDTAQ P VP P EM CS GWWW I RD PE MLDAMLKALHP R 
GIREKALHIOilJ^KHRDFTjQEVCIiRPSADPIFEPRQLPAFQEGIM 
S WS P KEKTYETDIiAVLQWVEELEQRVl MS DLQI RGWTCPS POST 
REDLAYCEHLSDS QED I TWRGRGREGLAPQRKTTNPLDIAVMRI* 
AALEQNVERRYLREPLWPTHEVV1*EKALI*STPNGAPEGTTTEIS 
YE I TPR IRVWRQTLERCRS AAQVCLCXGQLERS IAWEKS VNKVT 
CLVCRKGD1TOEFTJ^CI>GCDRGCHIYCKRPKM 
CLAQQVEGEFTQKPGF P KRGQKRKSGYS LNFS EGDGRRRRVL.L.R 
GRESPAAGPRYSEEGI*SPSKRRRLSMRNHHSDLTFCEI ILMEME 
SHDAAWPFLEPVNPRLVSGYRRI I KNPMD FS TMRERLLRGGYTS 
SEEFAADALLVFDNCQTFNEDDSE VGKAGHI MRRFFE \ SRWEEF 
YG^KQGQSVROGRWGVTIiWHLPPTFQTKTCHFHLJiMLPWVOTQV 
RYNPDF 


6957 


82 


3514


HI* I VAMPEPTKKEENE VPAPAPP P EE PS KE KEAGTTPAKDWTL V 
ETP PGEEQAKQNANSQLS I LF I EKPQGGTVKVGED I TF I AKVKA 
EDLS EKPTINGS RIG^DLASKAGKHLQIjKBTFERHSRVYTFEMQ 
IIKAKDNFAG1TYRCEVTYTCDKF1)SCSFTDIiEVHE5TGTTPNIDIR 
SAFKRSGEGQ EDAGELD F S GUjKRREVKQQEE E PQVDVWE LL.KN 
TKPSEYEKXAFQYESPTCSGMLKRIjKRS I REEKKSAAPAKI LDP 
VYQVDKGGRVRFWE LAD P KLEVKWNKNGQELRP STKY I FEDTR 
CQS I I*NIDWCQMTDDSEYYVTAGDEKCSTELLVREPPIh3VTKQI, 
EDTTDYCGERVEI*ECEVSEDDAQVKWPKNGEEIILVQTRYRIRV 
EGKKHILI IEGATKADAADYSVMTTGQQSSAICLSVDIiKPLKILT 
PL.TDQTVNLG KE I CUCCE I SEN IPG KWTKNGL PVQESDRLKWH 
KGRIHKliVIDHALTEDEGDYVFAPDAYl^VTLPAKVHVIDPPKI I 
LDGLDADNTVTVI AGNKLRLEI P ISGEPPPKAMWSRGDKAIMEG 
SGRI RTE S YPDS STLVID IAE RDDS GVYHINIiKNEAGEAHAS I K 
VKVVDFPDPPVAPTVTKVGDDWCIMNWEPPAYDGGSPILGYFIE 
RKKKQSSRWMRI*NFDLCKETTFEPKKM IEGVAYEVRI FAVNA\ I 
GISKPSMPSRPFVP1AVTSPPTLLTVDSWDTTVTMRWRPPDHI 
GAAGIJ)GYVLEYCFEGSTSAKQSDENGEAAYDIiPAEDWIVANKD 
LIDKTKFTITGLPTDAKI FVR VKAVN AAGAS E P KYYS Q P I L VKE 
1 1 EPP KIHSPKHLKQTY IRRVGDRV I IAFZ PFQGKPRPEI/TWKKD 
GAE IDKNQINIRNSETDTI I FIRKAERSHSGKYDLQVKVDKFVE 
TASIDIRIIDRPGPPQrVKIEDVWGRNVALTWTPPKDDGNAAlT 
GYTIQKADKKSMEWL.RVIEHI IEPVPHTELVIGNEYYFRVFSEN 
MCGLSFJ3ATMTKESAVIARDGKIYKNPVYEDFDFSEAPMFTQPL, 
VNRLCHSG YMATLNCSVRGNP KPK1 TWMKNKVAI VDDPR YRMFS 
NCX^CTI^IRKPSPYDGGTYCCKAVNDLGT^IECKIjEVKVIAQ 


6958 


274 


1663 


PRTSRVKTEGSQGSSAMDFS VKVDIEKEVTCP I CLELI*TEPI*SL. 
DCGHSFCQACITAKIKES VI ISRGESS CPVCQTRFQPGNLRPNR 
HIJ^VERVKEVKMSPQEGX?KRDVCFJmGKKLQI FCKEDGKVI C 
WVCEI>SQEHOXffiO/rraiNEVVXECQBKLQVALQRI,I KENQEAEK 
LEDDIRQERTAWKNYIQIERQKILKGFNEMRVILDNEEQRELQK 
LEEGEVNVLDNLAAATDQLVOQRQDAS TI>I S DDQRRLRGS SVEM 
LQDVIDVMKRSESWTLKKPKSVSKKLKSVFRVpDLSGMIjQVLKE 
LTDVQYYWVDVMLNPGSATSNVAI SVDQRQVKTVRTCTFKNSNP 
CDFSAFGVTGCQYFSSGKYYWEVDVSGKIAWIIXTVHSKISSLNK 
RKSSGFAFDPSVNYSKVYSRYRPQYGYWVIGLQNTCEYNAFEDS 
SSSDPKVLTX,FMAV\ LPWLGFS 


6959 


1


1469 


SLVHWEFGRGIBDFPYLFFQLTHCQQRICSVTQAGVQWCDHSS 
I^P<^PG1^QSSHI^I>I^SRDYRMI>SSFNEWFWQDRFWL»PPNVT 

VTTEIiEDRIX3RVYPHPQDL>LAALPLALV^ 

RWLGVRDQTRRQVKPNATLEKHFLTEGH^ 

TLQQTQRW FRRRRITQDRPQLTKKFCEASWRFX^LS S FVGGLSV 

LYHESWLV7APVMCWDRYPNQLTIiSCPAADSEA\SLYWWYliIiELG 

FYI*SLLIRLPFDVKRKGGGPSSIKPRPHYDPPSTA\DFKEQVIH 

HFVA VILMTFS YSANIJjR^GSLVLLIiHDS SDYLLEACKMVNYMQ 

YQQVCDALFLI FSFVFFYTRLVLFPTQI LYTTYYES I SNRGPFF 
Gr/FPNGLL^^JL^LLHVF^SCLILRMLYSFMKKGO^IEKDIRSDV 

EESDSSEEAAAAQEPLQLKNGTAGGPRPAPTDGPRSRVAGRLTN 
RHTTAT 


6960 


JO / 


?068 


AKWAREKEMQEF \TRS FF^RGRPDLSTLTHS IVRRRY LAHSGRS 
HLEPEBKQALKRIjVEEEPIJ(MQVDEAASREI?KLDLTKKGKRPPT 
PCSDPERKRFRFNSESESGSEAS S PDYPGPPAKNGVASRSHTHP 
KEENPRRA\ S KAVEE s sdeerqrdlp aqrgeess e EEEKG YKGK 

trkkpvvkkqapgkasvsrkqareeseeseaepvqrtakkvegn . 

kgtkslkeseqeseeeiiaqkkeqreeeveeebkeedeekgdwk 

prtrsngrrksareersckqksqakrllgdsdseeeqkeaassg 

ddsgrdreppvqrksedrtqlkggkrlsgssedeedsgkgepta 

kgspjcmarlgs^sgeesdlerevsdseagggpqgerknrsskks 

SRKGRTRSSSSSSDGSPEAKGGKAGSGRRGEDHPAVMRLKRYIR 
ACGAHRNYKXLLGSCCSHKERLS ILRAELEALGMKGTPSLGKCR 
ALKEQREEAAEVASLDVANIISGSGRPRRRTAWNPLGEAAPPGE 
LYRRTLDSDEERPRPAPPDWSHMRGIISSDGESN 


6961 


340 


1646 


RPWSSPTMKPNFSLRlAIFm^CWGIPYIiSKKRADRMRRLGDFL 
KQESFDLALLEEVWSEQDFQYLRQKLSPTYPAAHHFRSGI igsg 
LCVFS KH P I QELTQHIYTLNGYP YMIHHGD W FSGKAVGIiLV1jH1> 
SGMVLNAYVTHLHAE YNRQKDI YTiAHRVAQAWELAQFIHHTSKK 
ADVVLLCGBIiNMHPEDIiGCCIiKEWTGIiHDAYLETRDFKGSEEG 
NTMVPKNCYVSO^EIjKPFPFGVRIDYVIjYKAVSGFYISCKSFET 

ttgfdphrgtplsdhealmatlfvrhsppqqnpssthgp\aers 
pl/mc^o.kealdgslgi^maNqar wwa\tfa\ s yviglgl \ LL 

LAIiLCVIiAAGGGAGEAAI LLWTPS VGLVLWAGAFYLFHVQEVNG 
LYRAQ^AEU2HVLGRAREAQDLGPEPQLYALI>\LGQQEGDRTKSQ 


6962 


340 


1646 


RPWS S PTMKPNFSLRLR I FNLNCWG I PYX.SKHRADRMRRLGDFL. 
NQESFDrAI^EWSEQDFQYliRQKLSPTYPAAHHFRSGXIGSG 
LCVFS KHP I QELTQHXYTLNGYP YM I HHGDW FSGKAVGLLVLHL 
SGMVLNAYVTHLHAEYNRQKD I YLAHRVAQAVIELAQFIHHTS KK 
ADVVLLCXSDLNMHPEDLGCCIjLKE^ 

ntmvpkncyvsqqelkpfpfgvridyvlykavsgfyi scks fet 
ttgfdphrgtplsdhealmatlfvrhsppqqnpssthgp\aers 
pl/mcvclkealdgslglgma\qarwwa\tfa\syviglgl\ll 
i^allcvlaagggageaal llwtps vglvlwagafylfhvqevng 

LYRAQAELQHVI^RAREAQDLGPEPQLYALL\LGQQEGDRTKEQ 


6963 


374


2618 


RVTPL ILKLLKKPierAEKQKASEENE ITQPGGSSAKPGLPCLNF 
EAVIiSPDPALIHSTHSLTNSHAHTGSSDCDISCKGMTERIHS IN 
I^FSNSVLETI^EQRNRGHFO>VTVRIHGSMLRAQRCVIJ^ 
PFFQDKLLLGYSDIEIPSWSVQSVQKLIDFMYSGVLRVSQSEA 

LQILTAAS I LQ I KTV IDECTRI VS QNVGDVFPG I QDSGQDTPRG 
TPESGTSGQSSDTESGYLQSHPQHSVDRIYSALYACSMQNGSGE 
RSFYSGAWSHHETALGLPRDHHMEDPSWITRIHERSCXJMBRYL 

STTPETTHCRKQPRPVR1QTLVGNIHI KQEMEDD YD Y YGQ QR VQ 
ILERNESEECTBDtDQAEGTESEPKGESFDSGVSSSIGTEPDSV 

EQQFG PG AARDS Q AE PTQ P EQ AAEAP AEGG PQTNQLETG AS S P E 
RSNEVEMDSTVITVSNSSDKSVLQQPSVNTSIGQPLPSTQLYLR 

QTETLTSNLRMPLTLTSNTQV I GTAGNTYLPALFTTQPAG SGP K 
pft^fslpqpiiagqqtqfvtvsqpgi^tftaqlpapqplassagh 

stasgqgekkpyectlctktrtakqnyvkhmfvhtgekphqcsi 

cwrsfslkdylik\hmvthtgvrayqcsic^ 

rlhrge ksyecy i ckkkfshktllerhvalhs asngtp pagtpp 

garag p pgvvactegttyv cs vcpakfdq i eq fndhmrmhvs dg 


6964 


1 


178


SGRPFTFPFSNTDVYPIKXVTNRWTAGSSYKMTRMKSIGKILLL. 
QI FIG\NCSMFVLVI 


6965 


757 


208 


WFIEPRIQGFMKTSAHPGQKHPDFSMGLLFPLLAALEVCSCGS 
SGS LGYNLPQNH\GIJ^RNTLVLI^QMRRISPFL,CLKDRSDFRF 
PQEKVEVS0IiQKA\QAMSFXYimjO^VPNFSHKAIiL\CCMEHDL 
PGPTPHFTS SAAGTPGDLLGAGDGRRRSWGQWV I EGSTLALRRY 
FQESISTLE 


6966 


820 


1867 


IITALGVRGMPGCPCPGCGMAGPRLLFLTALALEIjLGRAGGSQP 
AIiRSRGTATACRLDNKESESWGAIjLSGERLDTW I CSLLGSLMVG 
LSGVFPLLVI PLEMGTMLRSEAGAWRLKQLLS FALGGLLGNVFL 
HLLPFJUCAYTCSASPGGEGQSI^C^QI^LWVIAGILTFLALEK 
/HVPGCX3GGGDQPGPCX2RPHCCC51RAQWRPLSGPAGCRARPRCR 
GP\DI KVSGYIjNLLANTIDNFTHGIAVAASFLVS KKIGLLTTMA 
ILL1IE I PHEVGDFAILLRAGFTDRWSAAKLQLSTAIiGGLIiGAGPA 
ICTQSPKGVEETAAWVLPFTSGGFLYIALVNVLiPDIJL.EEEDPW 


6967 


162 


633 


GFIiPFKYWIIiDLSASSRMETDCNPMELSSMSGFEEGSELNGFEG 
TDMKDMRliEAEAWNDVL FAVNNM FVS KSLRCADD VA Y I UVETK 
ERNRYCTiEI*TEAGI>KWGYAFIX}VDDHI>OTPYHETVYSI*IiDTL\ 
SPAYREAFGKR\LLQRI*EAI*KRDGQS 


6968 


1 


2265 


RGGGGGRGGPGARERERPGEPERTMEAAAGGRGCFQPHPGLQKT 
LEQFKLSSMSSLGGPAAFSARWAQEAYKKESAKEAGAAAVPAPV 
PAATEPPPVLHLPAJQPPPPVLPGPFFMPSDRSTERCETVLEGE 
TISCFVVGGE1CRLCI*PQIIiNSVIJU3FSIjOT 

CTADQLEILKVMGILPFSAPSCGLITKTDAERliCNALiIiYGGAYP 
PPCKKEIAASIJU^I£LSERSVRVYHE\CFGKCKGL\LVPELYS 
SPSAACIO^D\CmiMYPPHKFVVHSHKALENRTCHWGF\DSA\ 
NWRAYILLS QDYTGKEEQARLGR \ CLDDVKEKFDYGNKYKRRVP 
RVS SEP PAS I RP KTDDTSSQS PAPS EKDKPSS WLRTLAGS SNKS 
LGCVHPRQRLSAFRPWSPAVSASEKEIiSPHLPALXRDSFYSYKS 
FETAVAPNVALAPPAQQKWSSPPCAAAVSRAPEPLATCTQPRK 
RKLTVDTPGAPETLAP VAAPEEDKDS EAEVEVE S REEFTS S LS S 
LS S PS FTSS S S AKDLGS PGARALPS AVPDAAAPADAPSGLEAEL 
EHLRQALEGGLDTKEAKE KFLHEWKMRVKQEE KLS AALQ AKRS 
LHQELEFLRVAKKEKI^EATEAKRNI.RKEIERJ-RAENEKKMK^ 

NESRLRLKRELEO^RQARVCDKGCEAGRLRAKYSAQI EDLQVKL 
QHAEADREQLRADLiLREREAREKLE K\VVK\ELQEQLWPRARPE 
AAGSEG\ AAELE P 


6969 


1855 


118 


AGTMHGRLKVKTSEEQAEAKRIjEREQKIjKLYQSATQAVFQKRQA 

GELDESVLELTSQI LGAN P DFATLWNCRREVLQQLETQKS PEEL 
AALVKAEIiGPLESCLRVNPKSYGTWHHRCWLLGRLPEPNVTTREL 
ELCARPIiEVDERNFHCWD YRRFVATQAAVP PAE ELAFTDSLI TR 
NFSNYSSWHYRS CL.LPQLHPQPDSGPOGRLPEDVX.LKELELVQN 
AF FTD PNDQSAW FYHRWLLG RADPQDALiRCLHV S RD EACLTVS P 
SRPLLVGSRMEI LLLMVDDS PLIVEWRTPDGRNRPSHVWLCDLP 
AASLNDQLPQHTFRVI WTAGDVQKE CVLLKGRQEGWCRDS TTDE 
QLFR(^I,SVEKSTVIjQSELESCKELQELEPENKWCL\LTI illm 
RALDPLLYEKETLQ YFOTLK\AWDPKRATY\ LDDLRSKPLLENS 
VLKMEYAEVRVLHLJVHKDLTVLCHLEQIjIiVTHLD 
P PALAALRCLEDP P PRT \ VLOASDNAI ESLDGVTNLPRLQELLL 
CWRLQQPAVLQPLASCPRLVLLNLQGNPLCQAVGI LEQLAELL 

PSVSSVLT 


6970


3 


1528


SFPPIjLSSPSAVGEGKVAVAAPCPGRSECARAXMAYIQLEPIxNE 
GFLSRISGI.LI>CRWTCRHCCQKCYESSCCQSSEDEVEILGPFPA 
QTPPWLMASRSSDKDGDSVHTASEVPLTPRTNSPDGRRSSSDTS 
KSTYSLTRR I SSLESRRPSS PLI DI KPIEFGVLSAKKEP IQPSV 
IMTYNPDDYFRICFEPHLYSLDSNSDDVDSLTDEEIIiSKYQIjGM 
LHFSTQYDIiLHNHLTVRVI EARDliP PPISHDGSRQDMAHSNPYV 
KICLLPDQK^SKG^GVKRKTQKPVFEERYTFEIPFLEAQRRTLL 
LTVVDFDKFSRHCVTGKVSVPLCEVBI,VKGGHWWKALIPSSQNE 
VEI^EIJ^r^LNYLPSAGRIJ^VDVIRAKQLLQTDVSQGSDPFVKI 
QLVHGLKLVKTKKTS FLRGTIDPFYNESFSFKVPQEELENASLV 
FTVFGHN>^SNDFIGRIVIG\QYSSGP\SEPNHWRRMLNTHRT 
AVEQ WHS LRSRAECDRVS PASLEVT 


6971 


37


3702


ACFYVPGSRS FKUX PRHGLVNMGRSGKLPSG VSAKDKRWIGCGHS 
SDSNPAI CRHRQAARSRFFSRPSGRSDLTVDAVKLHNELQSGSL 
RliGKSEAPET PMEF.K AELVXiTEKS SGTFLSGliSDCTNVT FSKVQ 
RFWESNSAAHKE I CAVIiAAVTEVI RSQGG KETETEYFAAL I RKA 
AQHGVCSVLKGSEFMFEKAPAHHPAAISTAKFCIQEIEKSGGSK 
EATTTliHMLTLLKDIiLPCFPEGLVKSCS ETL.LRVMTLSHVL.VTA 
CAMOAFHSLFEARPGIiSTI*SAEI*NAQ 1 1 TALYD YVPSENDL»QPI* 
LAWI^CVraKAHI^VRLQWDLGI^^ 

LTAATQSLKE I LKECVAPHMADIGS VTS SASGPAQSVAKM FRAV 
EEGIiTYKFHAAWSSVLQLIiCVFFEACGRQAHPVMRKCliQSliCDL 
RI^PHFPHTAAIiDQAVGAAVTSMGPEVVIiQAVPLEI DGSEETL.D 
FPRSWLL.PVIRDHVQETRLGFFTTY FLPLANTLKSKAMDLAQAG 
STVESKIYDTIX^WQMWTLLPGFCTRPTDVAISFKGLARTI^MAI 
SERPDI^VTVCQAI^TLITKGCX^AEADRAEVSRFAKNFLP I LFN 
LYGO PVAAGDTPAPRRAVLETIRTYliTI TDTQLVNSLLEKAS EK 
VLDPASSDFTRLSVlaDLWAIAPCADEAAISKX.YSTIRPYLESK 

AHGVQ KKAYRVLEEVCAS POGPGAL FVQSHLEDLKKTLLDS IjRS 
TSS PAKRPRLKCIJiHrVRKLSAEHKEFITALIPEVI LCTKEVSV 
GARKNAFALLVEMGHAFLRFGSNQEEALQCYLVLIYPGLVGAVT 
MVSCS IIALTHLbFEFKGliMGTSTVEQLLENVCLLLASRTRDVV 
KSAIiGFIKVAVTVMDVAHIJUCHVQLVMEAIGKLSDDMRlUiFRMK 
LRNLFT\FCFIPK\FGILTVJGKKAVGPKEYHRVI,VNIRKAEARAK 
RHRALSQAAVBEEEEEEEEEEPAQGKGDSIEEILADSEDEEDKfE 
EEERSRGKEQRK1»ARQRSRAWLKEGGGDE PI1NFI1DPKVAQRVI1A 
TQPGPGRGRKKDHS FKVSADGRI*I IREEADGNKMEEEEGAKG ED 
EEMADPMEDVI IRNKKHQKIiKHQKEAEEEELE I PPQYQAGGSGI 
HRPVAKKAMPGAE YKAKKAKGDVKKKGRPDP YAY I PLNRS KLNR 
RKKMK1»QGQFKG1»VKAAQRGSQVGHKNRRKDRRP 


6972 


2179 


973 


PGGAILLPLWRRTRPREATVPRGAAQRGRARSAEGRI PSSQSPS 
PAEAGGATRS PPPRPPRPARPPGPSAPPLLRSDAGPGATVSAAA 
AAATERARRGATMGAQLS T LG H MVLF P VWFLYS IiLM KLFQRS T P 
AITLBS PDIKYPLRXIDREI ISHDTRRFRFAIf PS PQHILGLPVG 
QHIYLSARIIX5NLVVRPYTPISSDDDKGFVDLVIKVYFKDTHPK 
FPAGGKMSQYLESMQIGDTIEFRGPSGLLVYQGKGKFAIRPDKK 
SNPIlRT\nCSVGMIAGGTGITPMLQVIRAlMKDPDDHTVCHLLF 

anqtekd iliiirpei>eelrnkhsarfkl»wytl»drap eawdygog \ 
fvneemirdhlpppe\eeplvlmcgpppmiqyaclpnl\dhvgh 

ptercfvf 


6973 


1 


1964 


lqprcahrgu^qkosrpapgvdamvlcpvigkl^krvvii^ 
sprrqeilsnagiirfewpskfkekldkasfatpygyametakq 
kalevanrlyqkdlrapdwigadtivtvgglilekpvdkqday 
rmlsrfe / sgrehs vftgvai vhcsskdhqldtrvs efyeetkv 

KFSELSEELLWEYVHSGEPMDKAGGYGIQALGGMIiVESVHGDFL 
NVVGFP3JNHFCKQLVKLYYPPRPEDLRRSVKHDS IPAADTFBDL 
SD VEGGGS EPTQRDAGS RDB KAEAGEAGQATAEASCHRTRETLP 
PFPTRLLEuIEGFMI*SKG1jLTACKI*KWDLLKDEAPQKAADIAS 
KVDASACGMERLLD I CAAMGLLEKTEQGYSNTETANVYLASDGE 
YSLHGPI MHNNDLTWNLFTYLE FAIREGTNQHHRALGKKAEDL P 
QDAY YQS P E TRLR FMRAMHG MTKLTACQ VATAFNLS R FS S ACD V 
GGCTGALARELAREYPRMQVTVFDbPDI I EIiAAHFQPPG POAVQ 
IHFAAGDFFRDPIjPSAEIjYVLCRIIjHDWPDDKVHKLI»SRVAESC 
KPGAGLLLVETLlJ)EEKRVAQRAL^K3SIiNMLVQTOGKERSIiGEY 
QCLLEI/HGFHQVQWHIiGGVLDAILXPPKWPPEAQAACSI, 


6974 


3082 


2172 


RSCAAFAS FASRPP IiELFAPPGSHRSPPGRG VATSAQCALSVRK 
LLAARPGLGTKYQATMVYKTLFALCI LTAGWRVQS I>P TSAPLSV 
SLPTNIVPPTTIWTSSPQNTDADTASPSNGTHNNSVLPVTASAP 
TSltLPKNISIESREEEITSPGSNWEGTNTDPSPSGFSSTSGGVH 
LTTTLEEHSIjGTPEAGVAATLSOSAAEPPTLI SPOAPASSPSSL 
STS PPEVFSASVTTNHSSTVTSTQPTGAPTAPESPTEESSSDHT 
PTSHATAEPVPQEKTPPTT vsgkvmceli dmet\pp p fpg 


6975 


2 


500 


R PR PTVHCCKWALKLETAMETLINVFHAHSGKEGDKYXLS KKEL 
KELLQTELSGFLDVKELML+ATEALKTFEEA* kspi iqcsssrs 
S LP PAPQP P P YL* I«S AVPFP I HLPL»PLLP PQAQKDVDAVDKVMK 
ELDEHGDGEVD FOE Y WLVAALTVACNNFFWENS 


6976 


1216 


970 


GCQJL+ VAYGTTENS PVTFAHPPEDTVEQKAES VGR IMPHTEAR I 
MNMEAGTLAKIJOTPGELCIRGYCVM 

YWTGDVATMNEQGFCKI VGRSKDMI IRGGENI YPAELEDFFHTH 
P KVQEVQ WGVKDDRMGEE I CACI RLKDGEETTVEE I KAFCKGK 
I SHFKI PKYI VFVTNYPLTI SGKI QKFKLREQMERHLNL+ 1 KQQ 
ACPGRIA 


6977 


1298 


588 


SIiFINTNLLSNQIRKTSFGMCSEPISDNTEDQKGKIjKTPDFA*R 
ANKKSKHHVNGNRTVEPFPEGTQI^VFGT^CFWGAERKFWVTjKG 
VYSTQVGFAGGYTSNPTYKE^CSEKTGHAEVVRVVYOPEHMSFE 
ELLKVFWENHDPTTCMRQGNDHGTQYRSAIYPTSAKQMEAALSS 
KEN YQKVLS EHG FGPI TTD IREGQTF YYAEDYHQQY1>S KNPNG Y 
CGIX3GTGVSCPVGIKX 


6978 


3 


242 


SFPFRDSRRCGCC^GSSLRHTAVAMVKLSKBAKQRIjQQ1.FKGSQ 
FAIRWGFIPLVTYLGFKRGADPGMPEPTVLSLIiWG 


6979 


3917 


1146 


DEARVRGEAVAAAIfcSRCRHWSGPPPFPPSPPDRKGLRGTEPWE 

AGPGSGATPGARAMDVRR1jKVNELREEI/3RRGLDTRGLKTEXAE 

RLQAA3^EAEEPDDERELDADDEPGRPGHINEEVETEGGSEIiEGT 

AQPPPPGI^PHAEPGGY-SGPDGHYAWDNITRQNQFYDTQVIKQE 

NESGYERRPLEMEO^QAYRPEMKTEMKQGAPTS FLPPEASQUCP 

DRQQ FQSRKRP YE ENRGRG YFEHREDRRGRS PQPPAEEDEDDFD 

DTLVAXI/I^CPIiHFKVARDRSSGYPLTIEGFAYLWSGARASYG 

VRRGRVCFEM KI NEE I SVKHLPSTEPDPHWR I G WSLDSCS TQL 

GE E P FS YGYGGTGKKSTNSRFENYGDKFAENDV I GCFADFE CGN 

DVEIiS FTKNGKWMGIAFR IQKEALGGQALYPHVLVKNCAVEFNF 

GQRAE PYCSVLPGFTFIQHLPLSERI RGTVGPKSKAECEILMMV 

GIiPAAGKTTWAIKHAASNPSKKYNILGTNAIMDKMRVMGLRRQR 

NYAGRWDVLIO^ATQCT^RLIQIIAARKKR^^ 

RKMRP FEGFQRKAIVICPTDEDLKDRTI KRTDEEGKDVPDHAVL 

EMKANFTbPDVGDFIJDEVIjFIELQREEADKLVRQYNEEGRKAGP 

PPEKRFDNRGGGGFRGRGGGGGFQRYENRGPPGGNRGGFQNRGG 

GSGGGGNYRGGFNRSGGGGYSQNRWGNNNRDNNNSNNRGSYNRA 

PQQQPPPQQPPPPQPPPQQPPPPPSYSPARNPPGASTYNKNSNI 

PGSSANTSTPTVSSYSPPQSFGFFPSTFQPSYSQPPYNOGGYSQ 

G YTAP P P P P PPP PAYN YGS YGGYNPAP YTPPPPPTAQTYPQP S Y 

NQ YQQ YAQQWNQ YYQNQGQW P P YYGNYDYGS YSGNTQGGTSTQ 


6980 


1 


420 


GTRGRKTGRVAAPSTRRRTG1WQKJX3TRSPAMSLSDPGLGYHPT 
CWTIaRWPPLCSLHALHVFHCl.FSSRIiGTPVS PRLAMDPNCS CEA 
GGSCACAGSCKCKKCKCTSCKKSCCSCCPLGCAKCAQGCICKGA 
SEKCSCCA 


6981 


10 


1054 


PGRG FRJRASLRPAFAARG VFQGGLGQAKQARTRACAALPTPH P S 
APRLLEPQGVFSLFPPPPGPWPNMIIjTKAQYDEIAQCLVSVPPT 
RQSLRKLKQRFPSQSQATLLS I FSQEYQKH I KRTHAKHHTSEA I 
ESYYQRYIiNGVVKNGAAPVIiDIANEVDyAPSLMARIil LERFLQ 

VDCIKHAIGHEHBVLLRDLLLEKNLSFIjDEDQLRAKGYDKTPDF 

ilqvpvaveghiihwieskasfgdecshhaylhdqfwsywnrfg 
pglviywygfiqeldcnrergillkacfptnivtlchsia 


6982 


153 


1285 


FPQQDCSAPAAPGI*AGSEPRRLRAYRRRRQRARGLKRVAWbAP P 
PSIJuQGI^WAQAPVDGTIiGPEDSRASSPMIQNSRPSLLQPQDV 
GDTVETLMLHP VI KAFLCGS I SGTCSTUiFQPLiDLJjKTRIjQTIjQ 
PSDHGSFJlVGMIiAVI^KVVRTESLIiGXW 

YFGTLYSLKQYFLRGHPPTALESVMLGVGSRSVAGVCMS pi tvi 
KTRYESGKYGYESI YAALRSI YHSEGHRGLFSGLTATLLRDAPF 
SGI YXMFYNQTKNIVPHDQVDATIil PITNFS CGI PAG ILASLVT 
QPADVI KTHMQIiYPLKFQwIGQAVTLI FKDYGIaRGFFQGG I PRA 
LRRTLMAAMAWTVYEEMMAKMGLKS 


6983 


82 


773 


EMS FLQDPS FFTMGMWS IGAGALGAAALALKLANTDVFLS KPQK 
AALEYIiEDIDLKTI^KEPRTFKAKELWEKNGAVIMAVRRPGCFIi 
CREEAADLS S LKSMLDQLGVPLYAWKEH IRTEVKDFQP YFKGE 
I FIjDEKKKFiGPQRRKMMrTaGFIRl^VWyW^ 
GEGFILGGVFWGSt3KG^ILLEHREKEFGDKVNLI*SVLEAAKMI 
KPQTIASEKK 


6984 


1845 


1282 


GGRSAYSJ^AGSIjPRVPATAAAKMASGVQVADEVCRIFYDMKVR 
KCSTPEEIKKRKKAVI FCLSADKKCI I VBEGKE I LVGDVGVT 1 T 
DPFKHFVGMI*PE KDCRYAIiYDAS FETKESRKEELMFFLWAPEIA 

GSLIVAFEGCPV 


6985 


1887 


1324 


RRTAG IYPCFP KPGRTRHALCS WLLUL.TGQLAFDD FQESCAMM 
WQKYAGSRRSMPLGARILFHGVFYAGGFAIVYYLIQKFHSRALY 
YIGAVEQI^SHPEAQEALGPPLNIHYIiKIjIDREKFVDIVDAKLK 
IPVSGSKSEGLLYVHSSRGGPFQRWHLDEVFLEIjKDGQQIPVFK 
LSGENGDEVKKE 


6986 


642 


1350 


YHLY F KMGDPNS RK KQALNRL.RAQLRXKKESLADQ FDFKM Y IAF 
VFKEKKKKSAIjFEVSEVIPVMTNNYEENII^GVRDSSYSLESSI* 
EIjLQKD WQLHAPRYQS MRRDVIGCTQEMDFIIjW PRNDIEKI VC 

t.T . PS P WK F <? DT? P FT? P VO AKFE FHHGD YEKO FLHVIiS RKD KTG I V 
VNNPNQSVFLF IDRQHLQTPKNKATI FKLCSICLYL,PQEQLTHVJ 
A VGT I EDHLRP YMPE 


6987 


1623 


341 


LtEAAEKASRAFKESQRQTDS KNYETENWSPQKSQRRYDM YNTAC 
FLGEI EVGLYTIQ I I^I*TPFFlQCENELSKKHMVQFLSGiCWTI PP 
DPPJ^CYIAI^KFTSHLKNI^SDLKRCFDFFIDYMVLLKMRYTQ 
KEIAEIMI>SKKVSRCFRKYTELFCHIJDPCLIiQSKESQLLQEENC 
PJKKLEALRADRFAGIjLiE YLNPNYKDATTMES IVNEYAFLLQQNS 
KKPMTNEKQNS ILANI IL^CLKPNSfCL IQPLTTLKKQLREVLQF 
VGLSHQYPGPYFIACLLFWPENQEIJX5DSKLIEKYVSSLNRSFR 
GQYKRMCRS KQAS TIiF YLGKRKGLNS I VHKAKIEQY FDKAQNTN 
SLWHSGDVWKKNE VKDLLRRLTGQAEGKL I S VEYGTEEKIKI P V 
I SVYSGPLRSGRNI ERVSFYLGFS IEGPPGL 


6988 


3 


689 


TQLIiRRPAVFVGSAASG I RSGLWSASSGHWCAP AAGRAHAPVPR 
LVRGIiGAASTAAPQDAQTGPQPMPRADC IMRHLPYFCRGQWRG 
FGRGS KQLG I PTAN F P EQWDNLP AD I STG I YYGWAS VGSGDVH 
KMWS IGWNP YYKNTKKSMETH IMHTFKEDF YGE ILNVATVGYL 
RPEKNFDSLESLISAIQGDI BEAKKRIiELPBHIiKI KEDNFFQVS 
KSKIMNGH 


6989 


2 


1118 


LMPSDRPLS PSTHASAGSHCHAPPTTARRAFPI PFGSKSNMATL 
KDQLI YNLLKEEQTPQNKI TWGVGAVGMACAI S I LMKDLADE L 
ALVDVI EDKLKGEMMDLQHGS L FLRTP K I VS GKDYNVTANSKLV 
I ITAGARQQEGESRLNLVQRNVN1FKFI IPNWKYSPNCKLLIV 
SNPVDILTYVAWKISGFPKNRVIGSGQILDSARFRYLMGERIjGV 
HPLSCHGWVLGEHGDSSVPWSGMNVAGVSLKTIJIPDIiGTDKDK 
EQWKEVHKQVVESAYEVI KLKG YTSWAI GLS VADIoAES I MKNQLR 
RVHPVSTMIKGLYGI KDDVFI»SVPCILG<}NGISDIiVXVTLTSEE 
BARLKKSADTLWGIQKELQF 


6990 


719 


258 


THASGMAS WLALRTRTAVTS LLS PTPATALAVR YAS KKSGGS S 
KNLGGKSSGRRQGI KKMEGHYVHAGNI I ATQRHFRWHPGAHVGV 
GKNKCLYAl>EEGIVRYTKEVYVPHPRNTEAVDLITRIjPKGAV3JY 
KTFVHWP AKP EG TF KLVAML 


6991 


169 


451 


RRSS DFHN PGFLS R P VSLREN I HHQVI CSTKNKRRN PKKXAYI*L 
S S LLMTNLN PNES TENQP VDAYX-IAFTLDQE FI,TYACVEGTGCLF 
CGRHVH 


6992 


944 


510 


RQAPGCSSLAI^QWQVYCGLVRAPQVQTRPLSSRFVERRGAIjY 
RS PMNQENP P P YPGPG PTAP YP P YP PQPMGPGPMGGP YPPPQG Y 
PY^YPQYGWC^PQEPPKTTVYVVEDQRRDELGPSTCIjTACWT 
ALCCCCIiWDMIiT 


6993 


a 


374 


QWCVTCPQHNARQGPAVPPGIQAYGAAPFED1K3VDFTEMSKCRG 
DRVWI KNWNVASI^PLMKGPQTVVI^PPTAVKVEGI PAWIHHSH 
VKPAARETWEARPS PDNP FRVTLKKTTSPAP VTPGS 


6994 


346 


1100 


QWPEKDPVMAASS ISS P WGKHVFKAI LMVLVAXlIiHSAIAQSR 
RDFAPPGG^KREAPVDVLTQIGRSVRGTLDAWIGPETMHLVSES 
SSQVLWAISSAI SVAFFAIiSGXAAQLLNALGIiAGDYlAQGLKLS 
PGQ VQTFLLWGAGAIiWYWLLS LLLGLVLALLGR I L WGL.KLVI F 
LAG FVALMRS V PDPSTPJujIiLIiAI*I>I LYALLS RliTGSRASGAQI* 
EAKVRGLERQVEELRWRQRRAAKGARSVEEE 


6995 


144 


1346 


GSVAVGLSG IMAAQKDLWDAI VIGAGIQGCFTAYHTAKHRKRII* 

LLEQFFT* PHS RGS SHGQSR I IRKAYW2D F YTRMMHECY Q I WAQI* 

EHEAGTQIiHRQTGT.TJ.I^MKENQEL.KTIQANI.SRQRVEHQCLSS 

EELKQRFPNIRI*PRGEVGIJjDNSGGVIYAYKAIjRAIiQDAIRQLG 

GIVRIXSEKWEINPGLLVTVKTTSRSYQAKSLVITAGPWTNQIilj 

RPLGI EMPLQTLRIN VCYWREMVPGS YGVSQAFPCFLWI*GI»CPH 

HI YGLPTGE YPGLMKVS YHHGNHAD PEERDCPTARTD IGDVQI L 

SSFVRDHLPDLKPEPAV1ESCMYTNTPDEQFILDRHPKYDNIVI 

GAGFSGHGFKLAPVVGKILYELSMKLTPSYDLAPFRISRFPSIiG 

KAHL 


6996 


543 


1942 


ETANAEAAARKSAMDWKEVXiRRRIiATPNTCPNKKKSEQEIiKDEE 
MDLFTKYYSEWKGGRKNTNEFYKTI PRFYYRLPAENEVLLQKLR 
EESRAVFLORKSRELLDNEELQNLWFljLDKHC/rPPMIGEEAMIN 
YEN FLKVG E KAG AKCKQ F FT A KVFAKLLHTDS YGR I S I MQ F FNY 
VMRKVWLHG/TR 1GLSLYDVAGCXJYI»RESDI*ENYII^I*IPTIjPQL 
IX3LEKSFYSFYVCTAVRKFFFFIJ)PIJiTGKIKIQDILACSFI*DD 
LLELRDEELS KESQETNWFS APSALR VYGQ YIiNLDKDHNGMLS K 
EEI^RYGTTVTMTNVFLDRVFOECLTYDGE 

RKEPAALQ Y I FiCLLDIENKG YLNVFS LNYFFRAIQELMK I HGQD 
PVS FQDVKDE I FDMVKPKD PLKISLQDI*INSNQGDTVTT I L I DL 
NGFWTYENREALVANDSENSADLDDT 


6997 


370 


1104 


AMBLTIFILRLAI YILTFPLYLLNFIK3LWSWI CKKWFPYFLVRF 
WIYNECMASKKRELFS^QEFAGPSGKIiSLLEVGCGTGANFKF 
YPPGCRVTCIDPNPNFEKFLIKSIAENRHLQFERFWAAGENMH 


6998 



616 



6999 



14 



1591 



7000 



827 



7001 



2056 



844 



7002 



1043 



498 



QVADGSVDVWCTbVL.CS VKNQER I IjREVCRVIjRPGGAFY FMKH 
VAAE CSTWNY FWQQVLD PAWHLL»FDGCNI»TRE S WKAliERAS FS K 

LK LQHIQAPLS WEbVRPHI YGYAVK 

FVSRAIjL.RVRSRRKPAEERAAPGRPEDAPIECPGATNCPEPL.WC 
SHLPVPYAPPTMESRGKSAS S PKPDTKVPQVTTEAKVPPAADGK 
APLTKPSKKEAPAEKQQP PAAPTTAP AKKTS AKAD P ALLNNHSN 
IJ^APTVPSSPDATPEPKGPGDGAEEDEAASGGPGGRGPWSCEN 

FNPLLtVAGGVAVAAl A1>I IGVAFLVRKK 



GRAGAC^RRDTAMSIEIESSDVIRLIMQYLK£NSLHRALATLQE 
ETTVSLNTVDS I ES FVAD I NSGH WDTVLQAI QS LKLPDKTL I DL 
YEQVVl^iaEIJUXGAARSI>LRQlTJPMINU^O^QPE 
LARS YFDPREAY PDGSS KE KRRAAI AQALAGEVS WPP SRLMAL 
U3QALKWQQHQGLLPPGMTIDLFRGKAAVKDVEEEKFPTQLSRH 
IKFGQKSHVECARFSPDGQYIjVTGSVDGFIEVWNFTrGKIRiCDL 
KYQAQpNFMMMDDAVLCMCFSRDTEMIiATGAQDGKI KVWKI QSG 
QCI^FERAHSKGVTCLSFSKDSSQILSASFDQTIRIHGLKSGK 

TLKE FRGHSS FVNEATFTQDGHY 1 1 S ASSDGTVKI WNMKTTECS 
NTTFKSLGSTAGTDITWSVILLPKNPEHFWCI^SNTVVIMNMQ 
GQ1VRSFSSGKREGGDFVCCALSPRGEWIYCVGEDFVLYCFSTV 
TGKLER TLTVHEKDV1G IAHHPHQNX.IATYSEDGLLKLWKP 
GPGWFLiEIiMESEGPPESERSEFFSQREEENEEEEAQEPEETGP 
KNPIiLQPALTGDVEGLQKI FEDPENPHHEQAMQLLLEEDIVGRN 
LLYAACMAGQSDVI RALAKYGVNLNE KTTRGYTLL»HCAAAWGRIj 
ETTjKAIjVEIjDVDI EALNFREERARDVAAR YSQTECVEFIjDWADA 
RLTLKKY IAKVS1AVTDTEKGSGKLLKEDKNTI LSACRAXNEWL, 
ETHTEAS INELFEQRQQI^DIVTPIFTKMTTPCQVTCSAKSVTSH 

DQKRSQDDTSN 



RRCIiIIAFLKGCFIFIYFIFIFETEFLSCCPGWSAVAQSRilAN 
FASQVQAIFILPKDSQVGPDVKSEAAPKRAL.YESVFGSGEI CGP 
TSPKRIiCIRPSEPVDAVVWSVKHDPLPl^EANGHRSTNSPTI 
VSPAIVSPTQDSRPNMSRPLITRSPASPLNNQGIPTPAQLTKSN 
APVHI DVGGHMYTS S LATLTKYPESR I GRLFDGTEP IVTiDS LKQ 
HYFIDRLXSQMFRYII^FLRTSKLL.IPDDFKDY'TLLYEEAKYFQL 
QPM1JLEMERWKQDRETGRFSRPCECLVVRVAPDLGERITLSGDK 
SLtlEE VF P E I GDVIlCNSVNAGWNHDSTHVIRFPIiNGYCHLNS VQ 
VliERLQQRGFEIVGSCGGGVDSSQFSEYVIjRRELRRTPRVPSVI 

RIKQEPLD 



7003 



818 



61 



121 



2285 



PMPSSTRWTTS*TYTDT5 SAWACRPTTGTCT*TAAPCjPTVRWWP 
TPCSRHQSRRRLTCWCSTSRPCGR*GGLCVRTAPTRPTTSASSS 
SWTSAGTSWPAGRRTGrATSGTATTTSVWPGCGTRMWSTQWSSV 
PRSRSCCSRPATTPPSKPGAPHAPCASSRHLAHGIiAPSSPGLPA 

RGAEVC 

QGRFRAFCTQRDFl^PPGMRliSALLAI^KVIXPPHYRYGMSPP 

GS VADKRKNP P W I RRRPWVE PIS DED WYLiFCGDTVE I LEGKDA 
GKQGKWQ VI RQRKWVWGGLNTH YR Y I G KTMD YRGTMI PS EAP 
LLHRQVKLiVDPMDRKPTEI EWRFTEAGERVRVSTRSGRI I PKPE 
FPRATCIVPETWIDGPKOTSVEDAI^RTYVPCLKTl^EEVMEAM 

G I KETR \ NTRRS I G I BPGAEQLLPNFCPS LEG 
FIJjPVIiTSRSLRQPAVPHARLGGVEPAAMKBARAKTPRKPTVKK 
G\PKRTlJCrQIiG/YYCRVRPIX3FPLX2ECCIEVIIWTTVQLHTPE 
GY RLNRNGD YKETQ YS F KQ V FGTHTTQ KE LFD WAN P L» VN D L I H 
GKNGl*LFTYGVTX3SGKTHTiitTGSPGEGGLIiPRCiDMIFlIS IGSF 
QAKR YVFKSNDRNSMDIG^EVTJAIJ^RQKREAMPNPKTSS SKRQ 
VDPEPADMITVQEFCKAEEVDEDSVYGVFVSYIEIYNNYIYDLL 
EEVPFDPINPNLHNIiNCFVKI K2STH1WYVAGCTEVEVKSTEEAFE 
VFWRGQKKRRIANTHLNRESSRSHSVFNI KLVQAP LDADGDNVL 
QE KEQ I T I SQLkS LVDLAGSERTNRTRAEGNRIjREAGN INQSLMT 
LRTCMDVIjRENQMY GTNKMVP YRDS KLTHLFKNYFDGEGKVRM I j 
VCVNPKAEDYEENI^WKFAEVTQFA/EVARPVDKAI CGLTPGRR 
YRNQPRGP \ IGNE PLVTDWLQS FPPLPSCE I LDINDEQTLPRL 
I EALEKRHNI^RQMM I DE FNKQSNAFKALLQEFDNAVXS KENHMQ 
GKbNEKEKMI SGQKLE IERLEKKNKTLiBYXIEII^KTTTIYEED 
KRNI>QQELETQNQKIX2RQPSDKRRbEARlX^MVTETTMK^ 
ERRVAAJCQLEMQNKLWVKDEKLKQLKAIVTEPKTEKPERPSRER 
DREKVTQRSVSPSPVPVSYL 


7005 


63 


876 


RNMALYQRWRCLRLQGI.QACRLHTAVVSTPPRWLAERLGbFEEI, 
WAAQVKRLASMAQKEPRTIKISLPGGQKIDAVAWNTTPYQIARQ 
ISSTLADTAVAAQVNGEPYDIxERPLETDSDLRFLrTFDS PEGKAV 
FWHSSTHVIXlAAAEQFLGAVLOiGPSTEYGFYHDFFljGKERTI R 
GSEI^VLERICQELTAAARPFRRIjEASRDQLRQLFKDNPFKLHIj 
IEEKVTGPTATVYGCGTLVDI*CQGPHIiRHTGQIGGLKliI*SNSSS 
LWRSSG 


7006 


22 


898 


NAFGRHSTAVKMAAAAWIiQVLPVI LLLLGAHPSPLS FFSAGPAT 
VAAADRSKWHIPIPSGKNYFSFGKILFRNTTIFLKFDGEPCD1.S 
LNI TWYXiKSADCYNE I YNFKAEE VEL.YLE KLKEKRGLSGKYQTS 
S KbFQNCSELFKTQTFSGDFMHRIiPIiIiGEKQEAKENGTNIiTFIG 
DKTAMHEPI*QTWQDAP YI FI VHI G I SSS KES S KENSLSNIiFTMT 
VEVKG P YE YLTLEDYPLMI FFMVMC IVYVLFGVLWI*AWS ACYWR 
DLLRIQFW I GAV I FXiGMUSKAVF YAGFQ 


7007 


2 


1001 


AMTVSGPGTPEPRPATPGASSVEQIjRKEGNELFKCGDYGGALAA 
YTQAIX5LDATPQDQAVLHRNRAACHLKI»ED YDKAETEAS KAI E K 
DGGDVKALYRRSQAiEKI/3RlJX^VIJ)I^RCVSLEPKNKVFQEA 
LRNIGGQIQEKVRYMSSTDAKVEQMFQIIiLDPEEKGTEKKQKAS 
QNXfWXtAREDAGAEKI FRSNGVQLIiQRIJjDMGETDLMLAAXiRTI* 
VGICSEHQSRTVATLS ILGTRRWSIIjGVESQAVSI»AACHLI*QV 
MFDALKEGVKKGFRGKEGAI I VGE WKQVWGbLD VTVMEGMGI»S Q 
PGQFFGDQTCSCRIiFGIRFGDI I L»L» 


7008 


70 


1478 


CRSALGHERPPPAHIiPAGGRRLQTCPRSCRWLGRPPSGIiPPGPR 
SPPPIiAGPGQKMVQKKPAFJLQGFHRSFKGQNPFELAFSIiDQPDH 
GDSDFGLQCSARPDMPASQPIDI PDAKKRGKKKKRGRATDSFSG 
RFEDVYQLQEDVLGEGAHARVQTCINLITSQEYAVKI I EKQPGH 
IRSRVFREVEMLYQCQGHRNVLELIEFFEEEDRFYbVFEKMRGG 
S IIjSHIHKRRHFNELEASVWQDVASALDFLHNKGIAHRDIjKPE 
N I LiCEHPNQVS PVK I CDFDLGSG I KLNGDCS P I STPEI*I*TPCGS 
AE YMAPEWEAFS EEAS IYDKRCDLWSIiGVIIiYILLSGYPPFVG 
RCGS DCGWDRGEACPACQNMLFES IQEGKYE FPDKDWAHISCAA 
KDLISKLIiVRDAKQRLSAAQVLQHPWVQGCAPENTIjPTPMVLQR 
WDSHFliLPPHPCRIHVRPGGLVRTVTVNE 


7009 


1 


626 


ARQbRNSWVDDFVAAPbl P1»SQQI PTGNSLYES YYKQVDPAYTG 
RVGASEAALFLKKSGLSDIILGKI WDLADPEGKGFbDKQGFYVA 
LRbVACAQSGHEVTLSNLNI*SMPPPKFHDTSSPI/WTPPSAEAH 
W AVRVEE KAKFDG I FESIJLPIWGI^SGDKVKPVLMNSKIiPLDVL 
GRVWDLSD IDKDGH1.VRDEFAVAMHLVYRALE 


7010 


79 


571 


S HTRRAWPETLLS PLCPUjGGGTAMSGGEQKPERYYVGVDVGT 
GSVRAAI>VT)QSGVLLAFADQPIKNWEPQFNHHEQSSEDIWAACC 
WTKKWQG IDLNQI RGLGFDATCSIiWLDKQFHPIiPVNQEGDS 
HRNVI M WLDHRAVS QVNR I NETKHS VLQYVGG 


7011 


3 


994 


R I QTLPNQNQ SQTQPLLKTP PAVI»Q P IAPQTTFG VQTQPQPQSL 
LQAQISAAS ITPLLQTQPQPLLQQPQQKAGLLQPPVRIVSQPQP 
ARRIiDPPSRFSGRNDRGDQVPNRKDDRSRERERERRRSRERSPQ 
RKRSRERSPRRERERSPRRVRRWPRYTVQFSKFSUDCPSCDMM 
ELRRRYQNTjYIPSDFFDAQFTWVDAFPLSRPFQI^OTCNFYVMH 
REVESLEKNMAILDPPDADHLYSAKVMIiM^ 

AEDPQELRDGFQK^ARLVKFIiVGMXGKDEAMAIG^ 

DPEKDPSVLIKT\AIRCCKALTG 


7012 


1 


2S61 


RIUU5SVKRGEARIjFGPTER05ERPIJIPSAARRPBMIjSGKKAAAA 
AAAAAAAATGTEAG PGTAGGSENGS EVAAQPAGLSG PAEVGPGA 
VGERTPRKKEPPRAS PPGGIAE PPGSAGPQAGPTWPGSATPME 
TGIAETPEG\RRTSRRKKAKVEYREMDESIiANLSED£YYSEEER 
NAKAE KEKKL P PP PPQAP PEE EN ES E PEEPSG VEGAAFQSRLPH 
DRMTSQEAACF PDI I SGPQQTQKVFLF I RNRTLQLWLDN P KI QL 
TFEATIiQQLEAP YNSDTVLVHRVHS YIiERHGLINFG I YKRI KPL 
PTKKTGKVI I IGSGVSGliAAAROLQSFGMDVTLL£ARDRVGGRV 
ATFRKGNYVADIjGAMVVTGIjGGNPMAVVSKOVNMEIiAKI KOKCP 
LYEANGQAVPKEKDEMVEQEFiniI^EATSYI*SHQIaDFNVI*NNKP 
VSLGQALEWIQLQEKHVKDEQI EHWKKIVKTQEELKELLNKMV 
NIjKEKIKEI^O^YKEASEVKPPRDITAEFLVKSKHRDI.TAI*CKE 
YDEIAETWKIiEEKIjQELEANPPSDVYLSSRDRQIIJDWHFANLB 
FANATPLSTLSbXHWDQDDDFB FTGSHLTVRNGYSCVPVAIAEG 
liDI KLNTA VRQVRYTASGCEVXAVNTRSTSQTFI YKCDAVLCTL 
PlX3VLKO^PPAVQFVPPLPEWKTSAVQRKGFGNlJ^KVVLCFDRV 

FWDPS VNLFGHVGS TTASRGEL FLFWNL YKAP I IiIAIiVAGEAAG 
I MEN I SDDVI VGRCbAI LKG I FGSSAVPQPKETWSRWRADPWA 
RGSYSYVAAGSSGNDYDLMAQPITPGPSlPGAPQPIPRiFFAGE 
HT I RNYPATVHGALLSGLREAGR IADQFLGAM YTLPRQATPGV P 
AQQSPSM 


7013 


1 


2661 


RRAGSVKRGEARLFGPTERQSERPLRPSAARRPEMLSGKKAAAA 
AAAAAAAATGTEAGPGTAGGS ENGS EVAAQ PAG LS G P AE VG PGA 
VGERT PRKKE PPRAS P PGGLAEP PGSAGPQAG PT WPGSATPME 
TG I AETPEG \ RRTSRRKRAKVEYREMDESLANLSEDE YYS EEER 
NAKAE KEKKLPPPPPOAPPEEENESEPEEPSGVEGAAFQSRLPH 
DRMTSQEAAC FPDI ISGPQQTQKVFLFIRNRTLQLWLDNPKIQI, 
TFEATLQQLEAPYNSDTVIiVHRVHS YLERHGLINFG I YKR IKPL 
PTKKTGKVI I IGSGVSGLAAARQLQS FGMDVTLLEAKDRVGGRV 
ATFRKGNYVADIjGAMVVTGLGGNPMAWSKOVNMELAKI KOKCP 

LY eangqavp kekdemveqe fnrlleatsyi^hqldfnvlnnkp 

VS LGQALEW I QLQEKHVKDEQI EHW KKIVKTQE ELKELLNKMV 
NLKEKI KELHQQ YKEAS EVKP PRD I TAE FL.VKS KHRDL.TALCKE 

ydeiaetcgkleeklqeleanppsdvylssrdrqildwhfanle 
fanatpi>stls lkhwdqdddfeftgshl.tvrng ys cvpvalaeg 
ldi klntavrq vr ytasg cev i avntr s tsqt f i ykcdavlctl 
pi^vlko^ppavqfvpplpewktsavqrmgfgnlnkvvlcfdrv 
fwdps vnlfghvgsttasrgelflfv3nlykapi liju^vageaag 
imeni s ddvi vgrclai lkgi fgssavpqpketvvsrwradpwa 
rgsysyvaagssgndydlmaqpitpgpsipgapqpiprjlffage 
hti rnypatvhgaiilsglreagrladqflgamytlprqatpgvp 

AQQSPSM 


7014 


3 


3350 


DFEVGDKIRILATLEDGWLEGSLKGRTGIFPYRFVKIiCPDTRVE 
ETMALPQEGSLAR I P ETSLDCLENTLG VEEQRHETS DHEAEE PD 
CI ISEAPTSPLGHLTSEYDTDRNS YQDEDTAGGPPRS PGVBWEM 
PLATDS PTSDPTEVVNGISSQ PQVPFHPNLQKS Q YYSTVGGSHP 
HSEQY PDLLPLEARTRDYASIiP P KRMYS QLKTLQKP VLPI> YRG S 
SVSASRVVKPRQSSPQLHNLASYTKKHHTSSVYS ISERLEMKPG 
PQAQGIiVMEAATHSQGDGSTDLDS KLTQQLIEFEJC5 liAGPGTE P 
DKILRHFSIMDFNSEKDIVRGSSKLITEQELPERRKALRPPPPR 
PCTPVSTSPHLLVDQNUCPAPPLWRPSRPAPLPPSAQQRTNAV 
SPKLLSRHRPTCETLEKEGPGHMGRSLDQTSPCPLVI»VRIEEME 
RDLDMYSRAQEELNLMIEEKQDES SRAETLEDLKFCESNIESLN " 
MEUX3LiREOTLI^SQSSSLVAPSGSVSAENPEQRMIjEK3iAKVIE 
ELLQTERD Y I RDLEM C I ERIMVP MQQAQVFNI DFEGLFGNMQMV 
I KVS KQIjIAALE ISDAVG PVFLGHRDELKGTYKI YCQNHDEAI A 
LLEIYEKDEKIQKHIjQDSLADLKSLYNEVJGCTNYINLGSFLIKP 
VQRVMRYPLLI>MEIjI^STPESHPDKVPL1^3AVLAVKEINVNINE 
YKRRKDLVLKYRKGDEDSIjMEKISKL^IHSI ikksnrvsshlkh 
LTG FAPQ I KDEVFEETE KN FRMQERL I KS F I RDLSli YLQH I RE S 
ACVKWAAVS MWDVCM ERGHRDLEQFBRVHRYISDQIiFTNFKER 
TE R1»V I S PIiNQIJ^MFTGPHKI*VQKRroKIiIiDF YNCTERAEKLK 
DKKTLEELQSARIWYEALNAQIjI^ 

AEAHCDFVHQALEQLKPLI^I^KVAGREGNLIAIFHBEHSRVLQ 
QLQVFTFFPES LPATKKPFERKT1DRQSARKPIjLGI*PS YMLQSE 
ELRASLliARYP PEKLFQAERNFNAAQDLDVSIiLEGDIjVGVI KKK 
DPMGSQNRWLI DNGVTKG FVYSS P1»KP YNPRRSHSDAS VGSHSS 
TESEHGSSSPRFPRQNSGSTLTFNPN\S\MAVSFTSGSCQKQPQ 
DASPPPKEWDQGTX*SASLNPSNSESSPSRCPSDPDSTSQPRSGD 
SADVARJDVKQPTATPRS YRNFORHPE IVGYS VPGRNGQSQDLVKG 
CARTAQAPEDRS TE PDG S EAEGNQVYFAVYT F KARN PNEI*S VSA 
NQKLKIIJEFKDVTGNTEWWI^AEVNGKKGYVPSNYIRKTEYT 


7015 


1842 


513 


RQAWHEWAAPSWRGARLVQSVUIVWQVGPHVARERVI PFSSbL 
GFQRRCVSCVAGSAFSGPRI*ASASRSNGQGSALDHFIiGFSQPDS 
SVTPCVPAVSMNRDEQDVTiLVHHPDMPENSRVI,RVVIjIjGAPNAG 
KSTLSNQLLGRKVFPVSRKVHTTRCQALGVI TEKETQVIIjLDTP 
GI ISPGKQKRIIHIjELSLLEDPWKSMESADLVVVLVDVSDKWTRN 
QLSPQliLRC^TKYSQIPSVLVMNKVDCLKQKSVX, 
VVNGKKLKMRQAFHSHPGTHCPSPAVKDPNTQSVGNPQRIGWPH 
FKE I FMLSAI^QEDVKTLKQYT*LTQAQPGPWEYHSAVLTSQTPE 
EI CANI IREKXLEHLPOEVPYNVQQKTAVWEEGPGGEIjVTQQKL 
LVPKESYVKLLIGPKGHVISQIAQEAGHDLMDIFLCDVDIRLSV 

KXiLK 


7016 


167 


2513 


ILNAPKPPPPRDSVEAVAAKRDTGGGSWGTGMDVSGQETDWRST 
AFT?QKLVSQIEDAMRKAGVAHSKSSKDMESHVFI,KAKTRDEYLS 
IiVARLI IHFRDIHNPOCSQASVSDPMNALQSLTGGPAAGAAGIGM 
PPRGPGQSIiGGMGSLGAMGQPMSLSGQPPPGTSGMAPHSMAVVS 
TATPQTQLQLQQVAAAAAAATARSSSSSSRRRYSSSSSSSNSKQ 
FQAQQSAMQQ\QFQA\VVQQQQQl,\QQQQQQQQHLI KLHHQN'QQ 
QIOXKXK}QIK?RIAQIiQLQO^OX3QQQQQOXKK3QALQAQPPIQQP 
PMQQPQPPPSQAIiP<2QIjQQMHHTQHHQPPPQPQQPPVAQNQPSQ 
LPPQSQTQPLVSO^QAIjPGQ^YTQFPLKFVRAPMVVQQPPVQP 
QVQQQQTAVQTAQAAQMVAPGVQVSQSS IjPMLSSPS PGQQVQTP 
QSMPPPPQPSPQPGQPSSQPNSNVSSGPAPSPSSFLPSPSPQPF ! 
\QSPVTARTPQNFSVPSPGPLWTPVNPSSVMSPAGSSQAEEQQY 
LDKLKQLS KYI E PLRRMINKI DKNEDRKKDLS KMKS IjIiD I I>TDP 
S KRCPLKTLQKCE I ALEKLKNDMAVPTP PPP PVP PTKQQYLCQP 
LLDAVItANI RS PVFNHSLYRTFVPAMTAIHGPP I TAP WCTR KR 
RLEDDERQS I PSVLQGEVAFJjDPKFIjVNIJJPSHCSNNGTVHLI C 
KLDDKDI,PSVPPI*EIjSVPADYPAQSPLWIDRQWQYDANPFLQSV 
HR CMTSRX.I1QLPDKHS VTALLNTWAQS VHQACLSAA 


7017 


1 


1785 


INLGNTCYMN S V I * AiFMATDFRRQVLSLNLNGCNSliMKKLQHL 
FAFLAHTQREAYAPRI FFEASRP PWFTPRS QQDCSE YLRFLLDR 
LHEEEKILKVQASHKPSEIliECSETSLQEVASKAAVX,TETPRTS 
DG EKTXiI EKMFGG KLRTHTRCLNCRSTSQKAEAFTOLSIAFWPS 
YS LEYMS CPDCS QS PS I QDGGLMQASVPGPS EEPVVYNPTTAAF 
ICDSLVNEKTIGS PPNEFYCSENTS VPNESWKI LVNKDVPQKPG 
GETTPSWDLLNYFLAPEILTGDNQYYCENCASIiQNAEKTMQIT 
EEPEYLILIT^FSYDQKYHVRRKILDNVSLPLVLELPVKRITS 
FSSIiSESWSVDVDFTDI^ENIAKKIiKPSGTDEASCTKLVPYT,T».S 
SVWHSG I S SESGHYYS YARN I TSTDS S YQMYHQSEALALASSQ 
SHLLGRDS PSAVFEQDLENKEMS KEWFLFNDSRVTFTSFQSVQK 
ITSRFPKDTAYVLLYKKQHSTNGLSGNNPTSGLWINGDPPLQKB 
LMDAITKDNXLYLOEOELNARARAIX5AASAS CS FRPNGFDDNDP 
PGSCGPTGGGGGGGFNTVGRIiVF 


7018 


484 


1066 


SLVFRGNTWSGEAGHHCSALFNLAAYHQLFVGTERIRAPEI I FQ 
P S L I GEEQAG IAETLG; Y ILDR Y PKDVQEMLVQNVFLTGGNTMY P 
GMKARMEKEI^EMRPFP^SFQVQlJVSNPVLDAWYGARDWAIJiHL 
DDNEVW ITRKEYEEKGGE YXJCEHCASN I YVP I RL PKQASRS SDA 
QASSKGSAAGGGGAGEQA 


7019 


1048 


335 


APGG FliVTMVFPAPS P PWMLG CCS HEVTAGP PTLCKDMS ALVAA 

RMRHIPLAPGSDWRDLPNIEVRIjSIXSTMARKIiRYTHH^ 

S SGALRG VCS CVEAGKACDPAARQ FNTL I PWCLPHTGNRHNHWA 

GLYGRLEWDGFFSTIVTNPEPMGKQGRVIaHPEQHRVVSVRECAR 

SQGFPDTYRLFGNI LDKHRQVGNAVPPPLAKAIGLEI KLCMLAK 

ARESASAKIKEEEAAKD 


7020 


1 


2154 


FADSKRKSVLLDKI KNLQVALTS KQQSLETAMS FVARNT FKRVR 
NGFU^RKVAVFFShrrPTRASPQIiREAVIiKLSDAGITPL.FL.TRQE 
DRQL INALQ rNNTAVGHAIiVLPAGRDLTDFLENVLTCHVCLD I C 
NIDPSCGFGSWRPSFJ^RRAAGSDVDIDMAFIIJDSAETTTLFQF 
NEMKKYXAYIiVRQIiDMSPDPKASQHFARVAVVQHAPSE 
MPPVKVEFSLTDYGS KEKIiVDFLSRGMTQLQGTRALGSAI EYTI 
ENVFESAPNPRDLKIWIJ4LTGEVPEQQLEEAQRVI LQAKCKGY 
FrVVI^IGivKVNIKivvi I r AiKJrWLivr cxsJUV Uxvo A d*M b&lrljriK 
FGRLLPS FVS S ENAF YLS PDIRKQCD W FQGDQ PTKNLVKFGHKQ 
VNVPNNVTSSPTSNPV*rrrKPVTTTKPVTTTTKPVTTTTKPVTI 
INQ P SVKP AAAK P AP AfCP VAAKP VATKTATVRP P VAVKP ATAAK 
P VAAKPAAVRP PAAAAAKP VATKPEVPRPQAAKPAATKPATTKP 
MVKMSREVQVFEITENSAKLHV^PEPPGPYFYDLTVTSAHDQS 
LVLKQNLT VTDRVIGGLLAGQTYHVAVVC YLRSQVRATYHGS FS 
TKTC fi DPPPPOP A£S AS £ S TINLMVSTEPI±AIjTETDICK1iPKDEG 
TCRDFI LKWYYDPNTKS CARFWYGGCGGNENKFGSQKECEKVCA 
PVLAKPGVI SVMGT 


7021 


2 


338 


VNAVSFFPNGYAFATGSDDATCRIjFDIjRADQEIjLLYSHDNIICG 
ITSVAFSKSGRIiLLAGYBDFNOTVAnXTLK^ 
CLGVTDDGMAVATGS WDS FLRIWN 








VYTG^FWSHPLLI PDNRKLFEAEEODL>FRDIOSL.PRNAAl»RKLN 
DLI KRARLAKVHAYI I S SLKKEMPS VFGKDNKKKELVNNIiAEI Y 
GR I EREHQ I S PGDFPNI*KRMQDQLQAQDFSKFQ P LKSKLLEWD 
DMLAHD IAQLMVLVRQEE S QRP I QMVKGGAFEGTLHGPFGHG YG 
FX3^EGIDDAEWVVARDKPMYDEIFYTIiSPVIX5KITCANAK^ 
VRSKLPNS VLGKI WKLAD IDKDGMLDDDEFAIiANHIiI KVKLEGH 
ELPNELPAHLLPPSKRKVAE 


7023 


2 


748 


AMVFGGWPYVPQYRDIRRTQNAIX5FSTYVC1»VI^VANII^IL^ 
WFGRR FES PLLWQS AI M I LTMLLMLKI>CTEVRVANELNARRRS F 
TAADSKDEEVKVAPRRSFIjDFDPHHFWQWSSFSDYVQCVLAFTG 
VAGYI TYLS IDSAtiFVETIIGFXJAVLTEA^fIlGVPQLYRNHRHQST 
EGMSIKMVUWTSGDAFKTAYFXLKGAPLQFSVCGI^ 
ILGQAYAFARHPQKPAPHAVHPTGTKAL 


7024 


1207 


190 


RTGVTGWAQVWMFGGGGVLSSGEQLQMPVKPERGLGPSDGWLV 

SSRRGS PGTVLGLPFWUUTPVLVSRS IRSMLLLTRSPTAWHRLS 

QIJCPPVLPGTLGGQAIiHLRSVTCjLSRQXSPAET 

RIiITGIiFGAGLGGAWIJU*RAEKERIiO^KRTEALRQAAVGG^ 

FHIJ^HRGRARCKADFRGQ^miMYFGFTHCPDlCPDELEKLVQV 

VRQLEAEPGLPPVQPVFITVDPERDDVFJ^MARYVQDFHPRIiLGL 
TGS TKQVAQASHS YRVYYNAGPKDEDQDY I VDHS I AI YTaLNPDG 
LFTDYYGRSRSAEQI SDSVRRHMAAFRSVLS 


7025 


232 


832 


ERNSP IGNNENL* K\HSIJ)CliCFRGDWEGNTQFQTLQDNQEECF 
KOVIRTCEKRPTFNQHTVFNiJlQJtt,NT^ 

SDHTQHQL I HTS E KFCGDKECGNTFL PDS E VI Q YQTVHTVKKT Y 
ECKECGKSFSLRSSLTGHKRIHTGEKPPKCKDCXSKAFRFHSQI^S 
VHKRIHTGEKSYECKECGKAFSCG 


7026 


326 


1146 


NPNPSIGDI KDIKKAAXSF4LDPAHKSHFHPVTPSLVFLCFI FDG 
LHQAT iT iSVG VS KRSNTWGNENEERGTP YASRFKDM PN F I ALE K 
SSVIJ^CCDIiLIGVAAGSSDKJCTSSLQVQRRFKAMMAS IGRLS 
HGESADLLI SCNAESAIGWISSRPWVGELMFTFLFGDFES PLHK 
LRKSS * LPRKHR* QP INAVRMFLDQCMDGS I ALRAI VSEI PVFE 
EKXNNG*KGIGEIF*VWGCTLPPHY>JGAVTTNV^ 
QDEQPHIFG 


7027 


43 


954 


GRRI^QGX^RPEDAEDGAEGGGKRGEAGWEGGYPEIVKENKLFEH 

YYQELKIVPEGEWGQFMDALREPLPATIiRITGYKSHAKEILHCL 

KNKYFKELEDLEWDGQKVEVPQPIjSWYPEEIAWH 

S PHLE KFHQFLVS ETKSGNT SRQEAVSMI PPLLLNVRPHHKILD 

MC^APGSKTTQLIEMIJIADMNVPFPEGFVIANDVDNK^ 

QAKRLSS PC IMWNHDAS S I PRXiQ I DVDGRKE 1 LFYDR I LCDVP 

CSGDGTMRKNIDVWKKWTTIoNSLQLHGLQLRIATRGAEQL 


7028 


189 


608 


SRPPPEPEPGTMVEKGSDSSSEKGGVPGTPSTQSLGSKN KIRNS 
KKMQSWYSMLSPTYKQRNEDFRKLFSKLPEAERLIVDYSCALQR 
EI LLQGRLYLSENWI CFYSN I FRWETT I S I QLKEVTCLKKEXTA 
KLIPNAIQ 


7029 


1343 


40 


VLESNTEAKQATGTSSKLRHGTGQEKGREGPRCPSGLAQIjRLWG 
/PCPHAGRETGPRASAPI PGS*GHGWHW*RKDGRGERSEGPSAL 
SPHSPSIaLNMQQAPTHVGPGMGSQRPRSSVVPEQVGVGSOLSRE 
RWRA*RSI^GAAASERTEMITCERSP/RPCQGYDSSNWFTQPGKK 
TRKRNSRRNTMVSRGGGCLLY PLQS IMPE*QLR* GAHASPPTQG 
R*GKGGPRSPLTKASGTTHI PTPFFGS I P/RPTRDSGPGTDNS\ 
AAPGQKRGHREA * QGPEPV/ WGRVTTHLQGPAG * TKPLGS \ RNW 
VPGPAEGEQGEGAGLEGRP * PLKGCRSTLTFS PQLS I PMVGKKP 
PEGTTAS FFP \RS CHSE * RKP PPSCPHAPALSL PHPLPLPLP PI* 
PLPLPGAGT*HSARSGRPGQSETGSLCHNCHHCPPHCPKCSPGG 
T 


7030 


2 


521 


PVCFSAPGSGOGGKRRVNMELSAVGERVFAAEALLKRRIRKGRM 
EYLVKWKGWSQKYS TWEPEENILDARLLAAFEEREREMELYGPK 
KRGPKPKTFLLKAOAKAKAXTYEFRSDSARGI R I PYPGRSPQDL 
AS TSRAREGLRN \ RVCPRQRAAP APAAP \ PRRG P S G PGPRPG * G 
PGLHFPGPGGPS KHGFVPASEQHQHQQHLPRRGPS GPGPRPG 


7031 


960 


59 


HCSVPGAEWPRKP PAQ I C PQLTSRPHLS S PRSLS PGCGKS PGPG 
/CKPS/RHCDELHEGPSRTAALPCGKPQPKHGVEECG/PCPCLA 
PRRLTE PPALTVSPVGRAAPSGAL* PSGRACSACSHRLAPEAAL 
SAAAPRPSLGSGQNASGLPAASLPPQDSSQPHKTVPS PARSVP P 
LGAQARAAP PRLWC PRALVSG * EASPEAVSVAAGPPVPGPTPS T 
SGSTASHSRRGC* S PR* TPAP PRRDHGRS AAFE VLTAAASAQP C 
ASQGGPRPTGAGRTPSPLGLPFSRGPPAASARPFCRHPSL 


7032 


1393 


2104 


RRPGRTEPVEPPPVP PPPRASNSKSRCR* RNLHLAPL* QSPLRK 
SRQIGTSSLPFGRSAGERPRPAATFCLSRGGSSPVFL*PSSSSL 
EPWMKRQFGRIiHSLFWKSWQKMNSFLLTPKLDTSLMSGWRYRQR 
LPRLHTFLKKSLQMASELAPPLPTPAPLASSLPPPPGPPPLLPV 
PLA*LSRSG ILVPPNSGFSLSC\ PLGDH * GSSGE VRGS CGSPPP 
HHCWVLPPPP* LLLPPR 


7033 


689 


815 


RSRDCLSSSATSNRARRSKCSGPKRATPLDSGPGP*APPGPSSA 
LMMPSSCPWRTGAIiGPSPAGSRAIiGRCTSSVGPGSRWLTRTSSP 
GCATRTWRTMRMEPR PLRSRMGESAPGI PAELPSAAPSGPSAPS 
AAAPSAPTTPAAAGPNTL* SRRTAEWCWPPS CS CCWGWC * S WSA 
WDWRRPPLQVS PAPSSSCRASCCWCLES IT* S S STARS RATGAS 
SSSTCPTSRSDRGAAWTP\SPMGAPLLPCSVP1,ISREEAI*0DPR 
NPSP *GVCSGSSGHAGLALGKPPVACSVP 


7034 


92 


1942 


EDTSSMPFRI*I>I PliGIjIiCALLPQHHGAPGPDGSAPDPAHYRERV 
KAMFYHAYDSYLENAFPFDELRPLTCDGHDTV3GS FSLTLIDALD 

t»t t \ ipr m/nrtTT /"*ki\? e?'cir*r\'o\TTj - DTrT j^T>c\7ni7nT fwtkia QfV w^'Pi^IT 
T LiIj\ 1 !_»£• Y r OA LrjNVojc. r yKV Vc V JjyiJla vur UAi/vriMO vrtiiNi. 

RWGGIJ^AHI.I^KKAGVEVEAGWPCSGPIJiRMAEEAARKIjIiPA 

FQTPTGMPYGTVNlXHGVNPGETPVTCTAGIGTFIVEFATIiSSIi 

TGDPVFEDVARVAI^lRliWESRSDIGLVGrWIDVTjTGKWAQDAG 

IGAGVDSYFEYIiVKGAIIJiQDKKLMAMFIjEYNKAIRNYTRFDDW 

YLWVQMYKGTVS MP VFQ SLEAYWPGLQSLIGD IDNAMRTFIjN YY 

TVWKQFGGLPEFYTJIFQGYTVEKREGYPLRPBLI ESAMYLYRAT 

GDPTUjELpGRDAVES I EKI S KVECG FATI KDLRDHKUDNRMES P 

FLAETVTCYIiYLLFDPTNFIHNNGSTFDAVITPYGECILGAGGYI 

FNTEAHP IDPAALHCCQRLKEEQWEVEDLMRBFYSLKRSRSKFQ 

KtrrVSSGPWEPPARPGTLFSPENHDQARERKPAKQKVPIiIiSCPS 

QPFTSKLAIjIX3QVFI^SS*PIiDNFFIFIFIiRLNYNKIiLIiAIIKK 

K 


7035 



92 


1942 



EDTSSMPFRLd-iI PLGLLCAIjDPQHHGAPGPDGSAPDPAHYRERV 
KAM FYHA YDS Y LEN AF P FDELRPLT CDGHDTWG S F S LTL I DAI»D 
Till* \TLFYFQII*GNvSEFQRVVEVI*QD£>VUr UXl)vNJ^t»vr r> l NX 
RWGGLLSAHLLSKKAG^SVEAGWPCSGPLLR^EEAARKLLPA 
FQTPTGMP YGTVNLLHGVNPGETPVTCTAG IGTF I VEFATLS SL 

IGAGVDSYF'EYI*VKGAIIJX}DKKIiMAM 

YLWVQM YKGTVSM P VFQSLEA YWPGLQS L IGD I DNAMRT KLNYY 
TVWTfr>T?nrtT.P"PFVTJT POGYTVEKREGYPIjRPELIESAMYIjYRAT 
GDPTLLELGRDAVESIEKISKVECGFATIKDLRDHKLDNRMESF 
FItABTVKYI*YIiL FDPTNFIHNNGSTFDAVITPYGECI LGAGG YI 
FNTEAH P IDPAALHCCQRIJCEEQWEVEDLMRBFY S LKRSRS KFQ 
KNTVS SG PWE PPARPGTLFS P ENHDQARERKPAKQKVPLLS CP S 
QPFTSKLALLGQVFLDSS* PLDNFFI FI FLRLNYNKLIiLAXIKK 


7036 


442 


761 



" CLAPLFSCFQIINLHI*APSGRlaRWAWIiRGPGRN*I*PGEGPSIPT 
RNW* ERKAGCSQPC/ PAQQHHGRPPGVSPLPRDPHPTTLRPLPP 
PPPPPPPPPRRPPRNRRPG 


7037 


442 


761 


CLiAPItFS CFQ I INI*HLAPSGRI*RWAWLRG PGRN * IiPGEGPS I PT 
RNW*ERXAGCSQPC / PAQQHHGRPPGVSPLPRDPHPTTLRPLP P 
PPPPPPPPPRRPPRNRRPG 


7038 


155 


891 


GAG AASDMS SGLRAADFPRWKRHT SEQLRRRDRI'QRQAFEE 1 1 1> 
Q YNKLLEKSDIiHS VI*AQ KLQAE KHDVPNRHEI SPGHDGTWNDNQ 
LQEMAQLRI KHQEEI»TEIiHKKRGELAQ\RVIDLNNQMQRKDREM 
QMNFJUCIAECLQTISDLETECLDI^TKLCDI^RANQITLKDEYDA 
LQI TFTALEGKIJ^KTTEENQELVTRWMAEKAQEANRiNARE* KR 
LQEAAS PAAERACRS S KGTSTSRTG 


7039 


155 


891 


GAGAASDMSSGIiRAADFPRWKRHISEQLRRRDRLQRQAFEEIIL 
QYNKLLEKSDIiHS V1*AQ KLQAEKHDVPNRHE ISPGHDGTWNDNQ 
LQEMAQLR I KHQEELTELHKKRGELAQ\RVIDLNNQMQRKDREM 
QMNEAKIAECLQTI S DI*ETECI»DIJiTKLCDliERANQTLKDEYDA 
LQ I TFTALEGKLRKTTEENQELVTRWMAE KAQEANRLNARE * KR 
LQEAAS PAAERACRS S KGTSTSRTG 


7040 


34 


789 


KITPPRRPHRCSSGHGSDNSSVIiSGELPPAMGKTAIjFYHSGGSS 
GYESVMRDSEATGSASSAQE>STSENSSSVGGRCRSIiKTPKKRSN 








PGSQRRRLI PAI»SIiDTSSPVRKPPNSTGVRWVDGPLRSSPRGliG 
EPFEIKVY^IDDVERIjQRRRGGASKEAMCFNAKLKILEHRQQRI 
AEVRAKYEWLMKEI*EATKQYI*P1IJ)PNKW1»S EFDLEQVWELDS I*E 
YLEALECVTERLESRVNFCKAHLMMITCFDrT 


7041 


1 


567 


S GRVAMGRRRAPAGGS LGRALMRHQTQRSRSHRHTDS WLHTS Eb 
NDGYD WG RLNLQS VTEQSS IjDD FLATAKLAGTE FV AE KLN I KFV 
PAEARTGLLSFEBSQRI KKLHEENKQFLCI PRRPNWNQNTTPEE 
LKQ AE KDNFLEWRRQ L> \ VRIaE EEQ KX»1 1»T P FERNLD FWRQLWR V 
IERSDIWQIVDA 


7042 


7 


345 


P I HMAAAA LEAD I \ I S P L*F PH I QG Y L.LLS AS HG \ATS LHTKGAIi 
PLETVTMYTVIPKSKYVLVKPDTQYPYSENLDEFKRLAENSASN 
DDLLMAEVAI SDYGDKLTLELREKY 


7043 


2 


2170 


ARGMAARDSDSEEDliVS YGTGIiEPLEEGERPKKP I PLQDQTVRD 
E KGR Y KR FHGAFSGG FSAGYFNTVGSKEGWTPSTFVS S RQNRAD 
KSVLGPEDFMDEEDI^EFGIAPKAIVT-TDDFASKTKDRIREKAR 
QLAAATAP I PGATI*I*DDIiITPAKI*SVGFEXLRKMGWKEGQGVGP 
RVKRRPRRQKPDPGVKIYGCALPPGSSEGSEGEDDDYLPDNVTF 
APKDVTPTOFTPKDNVHGLAYKGLDPHQALFGTSGEHFNLFSGG 
S ERAGDLGE I GLNKGRKhOXSGQAFGV GAI>EEKDDD I YATKTJLiS 
KYOTVXiKDEEPGDGLYGWTAPRQYKNQKESEKDLRYVGKI UDGF 
SLASKPLS S KKI YPPPELPRDYRPVHYFRPMVAATSEKSHLLQ V 
IiSESAGKATPDPGTHSKHQIxNASKRAELIiGETPIQGSATSVLEF 
LSOKDKERIKEMKQATDbKAAQI/KARSLAQNAQSSRAQPS PAAA 
AGHCSWNMAIiGGGTATLKASNFKPFAKDPEKQKRYDEFLVHMKQ 
GQKDALERCI*DPSMTEWERG31ERDEFARAALLYASSHST1»SSRF 
THAKEEDDSDQVEVPRDQENDVGDKQSAVKMKMFGKIiTRDTFEW 
HPDKLLFQ/RI*VGIjPRVKRI>KYSVFNFX»TLPETASLPTTQASSE 
KVS QH RGPDKSRXPS R WDTSKHEKKEDS I S E F LRXiAR S KAEPPK 
QQ SSP Li VNXEEEHAPEIjS AN 


7044 


276 


734 


EVYLTDEFAKGRKVADLYELVQYAGNIIPRLYLLITVGWYVKS 
FTQSRKDILKDLVEMCKGVQHPI»RGLFIJ?NYLI^CTRNILPDEG 
EPTDEETTGDISDSMDFVLLNEAEMNKLWVRMQHQGHSRDREKR 
ERERQEIiRILVGTNLVRLSQV 


7045 


3 


513 


LGFKMEALS RAGQEMS LAALKQHD P Y I TS IADI/TGQVALYTFCP 
KANQWEKTDIEGTLFVYRRSASPYHGFTIVNRLNMHNLVEPVNK 
DLEFQLHEPFLI*YRNASLSIYSIWFYDKNDCHRIAKLMADWEE 
ETRRSQQA/RSGQTESQPGQWLQRPQAHRHPGDAEQSQG 


7046 


3 


513 


LGFKMEALS RAGQEMSLAALiKQHDPYITS IADLTGQVALYTFCP 
KANQWEKTD I EGTLFVYRRS AS PYHGFTXVNRIjNMHNIjVEPVNK 
DIiEFQIaHEPFLLYRNASLsS IYSIWFYDKNDCHRIAKLMADVVEE 
ETRRSQQA/RSGQTESQPGQWLQRPQAHRHPGDAEQSQG 


7047 


103 


486 


OJ4KIEKCX3WSEGLTSIKGNCHNFYTAISKI)VTYKEt»KNLIiNSi^ 
IMLIDVREIWEIIjEYQKI PESINVPLDEVGEALQMNPRDFKEKY 
NEVKPS KSDS / 1 V FS YLAG VRS KKALDTA1 SLG FHSYYER 


7048 


92 


627 


FFCLTIJ^SWDYRHHATRRVISSPVFTMEDSGKTFSSEEEEANY 
WKDIAMTYXQRAENTQEEIJIEFC^GSREYEA^ 
RDLiLSENNRLRMELET I KEKFEVQHS EGYRQI SALEDDLAQTKA 
IKDQLQKYI RELEQANDDLERAKRATDHGLSKTFE\QRLN\QAI 
EKKW 


7049 


393 


938 


KRTGSASYGGPPPGLGGPATXASVAGRCSSVGKI PARRCYEDEii 
VP VFEAVGR I Y E LRLMMD FDG Kl^G YAPVM YCHKHEAKRAVREIj 
NNYE 1 RPGRLLGVCCSVDNCRIjFIGG I PKMKKREE ILEE IAKVT 
EGVIJ>VIWASAADKMKNRGIJUJIGVREPPRGCHW1X3RKIjIAWX 

asslwg 


7050 


393 


938 


krtgsasyggpppglggpatxasvagrcssvgkiparrcyedei, 








VPVFEAVGRI YELRLMMDFDGKNRGYAPVMYCHKHEAKRAVREI, 

NNYEIRPGRIjLGVC<^VDNCRI*FIGGIPKMKKREBII*EEIAICVT 
EGVLDVIVXASAADKmNRGLRLRGVPJBPPRGC^I^RKiXAWX 
ASSLWG 


7051 


119 


816


kkmniiae 1 cdnakkgreyaiilgnydssmvyyqgvmqqiqrhcqs 
vrdpaikgkwqqvrqelleeyeqvksivgtlesfkidkppdfpv 
scqdepfrdpavwpppvpaehrappq1rr/rqsrsktseerngr 
SRS PGTCRPST\ pisksekpstsrdkdyrargrddkgrknmqdg 
asdgempkfdgagydkdlvealerdivsrnps ihwddiadleea 


7052 


467 


715 


SCPGRGKMS KJjI^PBEMTSRDYYFDS YAHFG IHEEMJaKDEVRTL 
TYP^S^TYHNKHVFKDKyVlJ3VGSGTGII^^IFAARC^3PRR 


7053 


467 


715 


SCPGRGKMS K^liNPEEMTSRDYYFDSYAHFGIHEEMLKDBVRTL 
TYRNSMYHNKHVTTCDKVVliDVGSGTGlljSMFAARQGPRR 




-L 




RRCRVTOAMEYDEKIiARFRQAHLNPFNKQSGPRQHEQGPGEEVPD 
VTPEEAI^ELPPGEPEFRCPERVMD1jGI>SEDHFSRPVGLFLASD 
VQQLRQAIEECKQV ILELPEQS EKQKDAWRL IHLRIjKI*QEL»KD 
PNEDEPNIRVIiIiEHRFYKEKSKSVKQTCXI KCNTI I WGL I QTWYT 

DYRCAEO^API /CS/DGWPSEARQCCYTGQYYC5HCHWNDLAV 
I P AR WHNWD FE PRKVSR CS MR YTiALMVSRPVLRLRE I N 


7055 


2 


527 


DSRRVSWRSWIJ^/WGKHLO^IWLSMNVJ^FW 
EYHYI^QMLG/ALCLSRASASVLNW 

SQKVPSRRTRJU^LDKSRTFHITCGATICIFSGVHVAAHLVNALN 
FSVNYSEDFVELNAARYRDEDPRKIJjFTIVPGLTGVCMEVVLiFL 
M 






COT 


UOKKVoNKoWliAJNii/ WbKHijLJbXrXWiJonMvJjkLtJ! Wa.1 r 1*Lj XNQv>P 
EYHYLHQMLG/ALCI^RASASVimJtfCSIjIL^ 

sqkvpsrrtrrlldksrtfhitcgati cifsgvhvaahlvnaln 
fsvnysedfvelnaaryrdedprki^fttvpgl 

M 


7057 



1368 


431 



G I YLHVNEKI PRPTCIGDRQENDKENLNLENHRDQEULiHAS CQA 
SGEVPSQASLRGFFTEDEPGCFGEGENI*PEAIjQNIQDEGTGEQL 
S PQERISEKQT^HIjPNPHSGEI4STMWI,EEKRETSQKGQPRAPM 
AQKLPTCRECGKTFYRNSQLIFHQRTHTGETYFQCTICKKAFLR 

ssdfvkhqrthtgekpckcdycgkgfsdfsgi^rhhekihtgekp 
ykctpiceksfiqrsnfnrhqrvhtgekpykcshcgksfswsssl 

DKHQRSHLGKKPFQ^-PVTKLSFPISISQPSHKNTQLHQEEIiCLR 
GYPC 


7058 


1 


469 


FSGFGAVPDAXiGCRMSDLRITEAFL»YMDYLiCFRALCCKGPPPAR 
PEYDLVCI GLTGSGKTS LLS KLCSES PDNWSTTG FS I KAVP FQ 
NAILNVKELiGG ADN I RKYWSRYYQGSQGVT FVLDSASS EDDLEA 
ARN*SCTQLLQHPQLCTLPFLILA 


7059 


1 


1178 


WPAFPRQPAAAAMDAI*LGTGPRRARGCI#GAAGPTSSGRAARTPA 

APMARPSAWLECVCVVTFDLELGQALELVYPNDFRLTDKEKSSI 

CYLSFPDSHSGCLGDTQFSFRMRQ03GQRSPWHADDRHYNSRAP 

VA1X2REPAHYFGWYFRQVKDSSVKRGYFQKSLVLVSRLPFVRL 

FQALLSLIAPEYJFDKIiAPCLBAVCSEIDQWPAPAPGQTLNLPVM 

GVWQVRIPSRVDKSESSPPKQFDQENIiPAPVVLASVHELDLF 

RCFRPVLTHMQTWfEI*O^EPIAVI*APSPDVS^ 

QPLRFCCD FRPYFT IHDSEFKEFTTRTQAP FNWLGVTNPFF I K 

TLQHWPHILRVGBPKMSGDLPKQVKLKKPFKV* RPWDTKP 


7060 


90 


1670 


SVNLPPSI»WPWEEAMDSTKSEPLKGS PEAEDGN I EYKKLVNPSQ 
YRFEHLVTQM KWRLQEGRGEAVYQ I GVEDNGLLVGLAEEEMRAS 








LKTLHRMAEKVGADITVLRBREVDYDSDM 
QQFLDIjRVAVLGNVDSGKSTIiLGVLT^ 

IrHE I QSGRTSS ISFE ILGFNSKGEVHG INGTQWGQTLRMGW * * * 
RT* DGGRVX'TOjFEIV* MNAIjRGL* TSS APLRKSKGNQI*N* ikng 
VKXKRG^PGNG^PGNSEGVGRAG!Um*GPWAJJGQVVNYSDSR 
TAEEICESSSKMITFIDLAGHHKYLHTTI FGIjTSYCPDCALI.I*V 
SANTGIAGTTREHLGLAIjA1.KVPFFIVVS KI DliCAKTTVERTVR 
QLERVLKQPGCHKVPM1.VTSEDDAVTAAQQFAQS PNVTPIFTLS 
SVSGESLDIiLKVTIiNII^PLTNSKEQEBLMQQLTEFQVDEIYTV 
PEVGTWGGTLSR* IDLIATLPTQPSPIYSKTSWPKGGDPGI 


7061 


364 


710 


Ai^PSPIXSPPCLPVMDPETTLEEPETARIiRFRGFCYQEVAGPRE 
AIJUU^ELCCQWLQPEAHSKEQMLEMIiVIiEQFljG^PPEIQAWV 
RGQRPGS PBEAAALVEGLQHDP *ARMPS PLGPPCLPVMDPETTL 
EEPE^EARIiRFRGFCYQEVMPREAIiARLRELCCQWMPEAHS^ 
nMT.R^^VLF.OFI/7ITiPPEIOAWVRGORPGSPEEAAAXIVEGLOHD 

PGQLLG 


7062 


71 


744 


AKAGTNIiERt^WI^YFFCIPKHKLKSSQKDKVRQFMACTQAGER 
TAI YCXTQNEWRI^EATDSFFQNPDSIiHRESMPJaAVDKJaaiERI* 
YGRYKDPQDENKIGVDGIQQPCDDLSLDPAS I S VLVIAWKFRAA 
TCiCV P<5RIO?F1iDKMTEIjG<I!DSMEKLKALIjPR1^GELKDTAKFKD 
FYQ FTFTFAKNPGQKGLDL * MAGAYWKL.VLS GR FKFLi Yl» WNT FL 
MEHH 


7063 


2 


562 


LRTVPDLPGRRFRAMRTGdRR*PELPPDMNSI,EQAEDI*KAFERR 
I*TE Y IHCLQPATGRWRMLIj I WS VCTATGAWNWLIDPETQKVS f 
FTSlAJNHPFFTI SCI TLIGLFFAG I HKRWAPS I lAARCRTVliA 
EYimSCDDTGKLILKPRPK^Q*QSSLIVMGLKIAPI J RISDTAKS 

H KG FliliRLDM 




inn 




RDTGSDPSSTRRLCST CCTGH * PAE P IAS PHPS RGTCP PASS AS 
SRRTGCWTCPPESGHAQARRSRRASASRWGARGAVRSAVAARGC 
S SRAGRWLETPGRRRGP PAC2AAAAGRLRGPAP * AAPPTASVPAR 
CR C P AARTGAP AAATWIiRRRLSGLRAP ALGRRRS P G P S P KSAA P 
PLLTPLGAGRAGGSRANS 


7065


1 


555 


" ATTTHSARRSGRGAAAEAAASAAGGRQKGPDRKAWEGRR'l u rPGG 
RSQSEPKAPPPQKRSEAAFASMAHSPVAVQVPGMQNNIADPEEL 
FTKXiERIGKGSFGEVFKGIDNRTQQWAlKI IDLEEAEDEIEDI 
roEITVLSQCDSSYVTKYYGSYIiKGSKLWIII^^ 

RAGPFDEFQ 


7066 


356 


676 


PGPQRGPWRAREGGHPLDPADHPRAPASIiRSNVRAATMMQICDT 
YNQKHSLFNAMNRPIGAVNmDOyTVMVPSIiLRDVPLADPGIjDOT 

VGVEVGGSGGCLEERTPP 


7067 



152 


973 


KEN I TMATE I GS P P RFFHM PRFQHQ APRQLFY KRPDFAQQQAMQ 

QLTFIXSKRMRKAVNRKTIDYNPSVIK^ENRIWQRDQ 

PDAG YYTTOLVP P IGMLNNPMNQkVTTKFVRTS TNKVKCPVFWRW 

TPEGRRLVTGASSGEFTI»WNGLTFNFETILQAHDS PVRAMTWSH 

NDMWMLTADHGGYVKYWQSNMNN\nCMFQAHKEAI REARFIHNI P 

FSVVPIVMVKLFSKCII/3AEMHG^CQFIXjNFIoHPINTIFFFVFT 

HSPFCWAPF 


7068 


222 


816 


DTMKE YVLLLFLAIiCSAKPFFSPSH lAljKNMMiKDMEDTDDDDD 
DDDDDDDDDDEDNS LFPTRE PRSHFFPFDLFPMCP FGCQCY SRV 
VHCSDLGLTSVPTNIPFDTRMLDLQNNKI KEIKENDFKGLTSLY 
GLI1»NNNKIjTKIHPKAFIjTTKKLRRI»YIiSHNQLSEI PLNLPKSL 

aelrihenkvkkiqkdtfkkk 


7069 


1147


1765 


frdhrryfyvneqsgesqwbfpdgeeeeeesqaqenrdetuucq 
tlkdktgtdsnstessetstgsiickes fsgqvs s s slrmplt p fw 
ttjxjswpvlqppiiplemppppppppespppppppppapkmppp 








BKTKKGRKDKAKKS KTKMPSLVKKWQS IQRELDEEDNSS SSEED 


7070 


1 


547 


DGTMKDSEAVQRATAIjIEQRIjAQEEENEKIjRGDARQKIjPMDLLV 
LEDEKHHGAQSAAXQKVKGQERVKKTSI*DLRRE 1 1 DVGG rQNLI 
EIJUOCIUCQiaCRDALAASHEPPPEPEEITGPVDEBTFLKAAVEGK 
MKVI EKFliADGGSADTCDQFRRTAIjHRASLEGHME HiEKIJjDNG 
ATVDFQ 


7071 


2 


921 


ARGTLRAI^TAKKVGKVGANGQKAAGPSADSVTENKIGSPPKTP 
VSNVAATSAGPSNVGTEIjNSVPvKSSPif JjTKVPAi PPHSEN XQx 
FQDPRTQI PFEVPQYPQTG YYPPPP TVPAGVAPCVPRFVRSNNV 
PESSI.PPASMPYADHYSTFSPRDRMNSSPYQPPPPQPYGPVPPV 
PSGMYAPVYDSRRIWRPPMYQRDDI IRSNSLPPMDVMHSSVYQT 
S LRER YNS LDG YYSVACQ P P S E PRTTVP LPRE PCX3HLKTS CEEQ 
IRRKPDQWAQYHTQKAPLVSSTLPVATQSPTPPSTLNRGEGS 


7072 


2 


921 


ARGTLRAIiETAKKVGKVG ANGQKAAG PSADS VTENKIGS PPKTP 
VSNVAATS AGPSITVGTEIjNS VPQKS S PFt»TRV PAYPPHSENIQ Y 
FQDPRTQIPFEVPQYPO^TGYYPPPPTVPAGVAPOTPRPWSNNV 
PESSLPPASMP YADHYSTFS PRDRMNSS PYQPPPPQPYGPVPPV 
PSGMYAPVYDSRRXWRPPMYQRDD1 IRSNSLPPMDVMHSSVYQT 
SLRERYNSIJDGYYSVACQPPSEPRTTVPLPREPCGHLKTSCEEQ 
IRRKPDQWAQYHTQKAPLVSSTLPVATQS PTPPSTLNRGEGS 


7073 


50 


504 


LAHGSFGVSDFPAPAAAPAHTLTSFSGSLSPQFRKPIjGRAPAMP 
LVR YRKWI LGYRCVGKTS LAHQFVEGEFS eajYIjP TVKN I Y SKI 
VTLGKDEFHLHIjVDTAGQDE YS II*P YSFI IGVHGYVLVYSVTSL 
HSFQVI ESLYQKLHEGHGK 


7074 


263 


1003 


VCPVLCSTRQEPGH^SLVTYFGKPTRRKEFLLGHCIAAGKMMIS 
WLF/mYAELVI^VGRVTLGENSRKKMKDCKIjRKKQNERVSRAM 
CAI»I^SGGGVIKAEIENEI)YSYTKI^IGIjDIiENSFSNIIjI*FVPE 

«T OT?M/-»H.T/**»XTVTr»T T E*l TV C T.l C T VTT'C?/-*Y OTTTT C* CTvTT VITOHTTG h 1H7 

YLiDrTlQNGNxr Lit vxCoW£jL»W liJ&JuKX I XLtooiNuXAJCUXi. 

MNATAALE FIjKDMKKIT^RLYIjRPELLAKRPRVDIQEENx^KAL 
AGVFFDRTELDRKEKLTbTES THVEI 


7075 


598 


1005 


NYINFFFRKEYPPHVQKVEINPVRIjSRLQGVERIMKKTEESESQ 
VEPEIKRKVQQKRHCSTYQPTPPLSPAS KKCLTHLEDLQRNCRQ 

AITLKESTGPLLRTS ihqnsggqksqotglttkkfygnnvekvp 

IDII 


7076 


279


1049 


LQSESSNAAEGNEQRHEDEQRSKRGGWS KGRKRKKPLRDSNAPK 
S PIiTGYVR FMNERREQIiRAKRPEVPFPE ITRMLGNEWS KLP PEE 

RQDAARQATHDHEKETEVKERS VFD I P I FTEEFLNHS KAREAEL 
RQLRKSNMEFEERNAAIX^KHVESMRTAVEKIjEVDVIQERSRNT^ 
iioohletlrovlts sfasmplpexgetptvdtidsym 


7077 


3 


1119 


SSMGSNSE I NGI^ALRKTDKYGFIjGGSQ YSGS LKSS I PVDVARQR 
ELKWI^MFSNV^KWLSRRFXJKVKI^C^KGIPSSLRAKAWQYLSN 
SKELLEQNPRKFEELERAPGDPKWIjDVIEKDLHRQFP FHEMFAA 
RGGHGQQDLYRI LKAYTI TOPDEGYCQAQAP VAAVEjLMHMPAEQ 
AFWCXVQICDKY1,PGYYSAGLEAJQIJ^ 

HLRRQRIDPVLYMTEWFMCIFARTLPWAS^RVWDMFFCEGVKI 
I FTlVALVLIJeHTLGSVEKLRS CKJGMYETMEQLRNLPQQCMQBDF 
LVHEVTNLPVTEALIERENAAQLKKWR^ 
RAIHEERRRQQPPLGPSSS 


7078 


483 


767 


FQGQRMAGEQKPSSNLLEQF Il*I»AKGTSGSAliTAI#ISQVIjEAPG 1 
VYWGELLELANVQEIiAEGANAAYXQ^ IANKE 
SLPELY 


7079 


2 


376 


SWEPKRPKEPSGSDGESDGPIDVGQEGQIiSQMARPLSTPSSSQ 
MQARKKRRGI I E KRRRDR INSS I*S EIjRRL VP TAFEKQGS S KUSK 








AEVTjQr^TVDHLKMIiHATGGTGTHAIiLFQAS FIQQIF 


7080 


200 


595 


VQLP LEAP CIiS ULiS CRDHSGGNRDLS RRHRDCRV YG S PQDG I P Y 
LTHPLOIQDWSVGIUjQIRAIATPGHTQGH^ 
CLFSGI)IXPbSGCGEFPRKREELGEEGHTEVRAATVPWRAlJCP ' 


7081 


213 


506 


A VTEEEMI LNS 1^ LC YKNKIiIZiAPM VR VGTLPMRLIiAIjDYGADI 
VYCEELIDIiKMlQCKRVVNEVI»STVDFVAPDDRVVFRTCEREQN 
RWFQMGTS 


7082




1137 


APSRNTMLMAW CRGP VI*I»CIiROGliGTNS FIiHGLGOE PFEGARS L 
CCRSSPRDLRDGEREHEAAQRKAPGAESCPSLPIiSISDIGTGCL 
SSLENI^PTIJIEESSPRE1CTSSGDQGRCGPTHQGSEDPSMI,S 
QAQS ATEVEERHVS PS CSTSRERPFQAGEL 1 LAETGEGETKFKK 
LFRIitn^F^^IiNSNWGAVPFGKIVGKFPGOILRSSFGKOYMLRRP 
ALEDYWliMKRGTAITFPKDINM I JjS MMD INPGDTVLEAGSGSG 
GMSIiFLSKAVGSQGRVI SFEA/RKDHHDIAKKNYKHWRDSWKLSH 
VEEWPDNVDFIHKDISGATEDI KS I/TFDAVALDMLNPH VTLPVF 
YPHLKHGGVCPVYVVN I TQ VI EI»LD 


7083 


115 


541 


RSNAVQLTRMEYAMKSLSLLYPKSLSRHVSVRTSWTQOLLSEP 
S PKAPRARPCRVSTADRS VRKG I MAYST iRPU/T >KVRDT1jMIiADK 
PFFLVI^EDGTTVETEEYFQAIjAGimrFMVLQKGQKWQPPSECG 

TRHPLSIiSHK 


7084 


3 


522 


NSVS VSSQSRFIiASVPGTGVQRSAAADMAASTAAGKQR I PKVAK 

YKLRKRKTFEDNI RKNRTVISNWI KYAQWEESLKEIQRARS I YE 
RALDVDYRN I TLWLKYAEMEMKNRQVNHARN I WDRAI TT1# 


7085


*"> A O 


jl *± i» y 


dot nor 'OT3X3C CJt? Q nirrr; aPMBUTTT Wf"iVT ,fVT\7~V~F!ZVTTY < 5 RTV^ Af^f"* 

AELVSFKHPHVANPRI^MASPEEKCMVI^PPYDEMFAAHLRCT 
YAVGNHDF I EAYKCQTVI VQS FLRAFQAHKEENWALPVMYAVAL 
DIJIVFANNAI>0^1*VKKGKSKVGDMIjEKAAEIjI*MS cfrvcasdtr 
AG I EDS KKWGML FLVNQliFiOC Y FKI NKLHI»CK PLI RAI DSSNLK 
DDYS TAORVTYKYYVGRKAMFDSDFKQAEEYLS FAFEHCHRSSQ 
KNKRMIL IYI,LPVT<Nn>IX5HMPTVELI^KYm,MQFAEVTRAVSEG 
NLLLLHEALAKHEAFF I RCG I FL I KLKI I TYRNLFKKVYLLL 
KTHQI>S LDAFLVALKFMQVEDVD I DEVQCIIaANli I YMGHVKGY I 
SHQHQKL WS KQNPFP P LS TGC 


7086 



256 


525 


ILAARMGKQNS KLRPEVMQD1.LESTDFTEHE IQEWYKGFLRDCP 
SGHLSMEEFKKIYGNFFPYGDASKFAEHVFRTFDANGDGTIDFR 

EF 


7087 


166 


723 


hSGS SAG KVAAP C VP P SNHEL VP I TTENAP KNWDKGEGAS RGG 
NTRKSI*EDNGSTRVTPS VQPHLQP I RNMSVS RTMEDS CBLDIiVY 
VTER I IAVS FPS TANEEN FRSNIjREVAQMLKS KHGGNYLIi FNLS 
ERRPDITKIiHAKVI>EFGWPDIiHTPAIiEXICS I CKAMDTWUXAHP 
HRCRVbHNKG 


7088 


104 


759 


GTSAA5 PSSLiIiEMAGEI TETGELYS S Y VGLVYMFNLI VGTGALT 
MP KAFATAGWIjVS LVLLVFLGFMS FMTTT FVI EAMAAANAQLHW 
KRMENIjKEEEDDDSSTASDSDVLIRDNYERAEKRPILSVQRRGS 
PNPFE I TDRVEMG QMAS MFFNKVGVNIiFYFCI IVYX.YGDLAIYA 
AAVPFSI^QVTCSATGOTSCGVEADTKYNDTDRCWGPLRRVD 


7089 



33 


1775 


S VCWEDRYLKARMEES P LSRAP SRGG VNFLNVART Y I PNTKVEC 
HYTLPPGT^SASDWIGIFKVEAACVRDYHTFVWSSVPESTTDG 
SPIHTSVQFXJASYIJKPGAQLYQFRYVNRQGQVCGQSPPFQFRE 
PRP^ELVTliEEADGGSDII,LVVPKATVI^NQIJ>ESQQERNDLM 
QLKLQLEGQVTEliRSRVQELERAIATARQEHTBLMEQYKGISRS 
HGEITEERDIl^RQQGDHVARILELEDDIQTISEKVIiTKEVEl»D 
MiRDTVKAL*TREQE KLIX^IiKEVQADKEQSEAELQYAQQENHHI* 
NUDLKEAKSWQEEQSAQAQRIiKDKVAQMKDTLGQAQQRVAEIiEP 








L^QL^GAQELAASSQQKATLIjGEEIaASAAAARDRTlAEbHRSR 

LKVAEWGKLAELGIJQjKEEKCQWSKERAGL^ 

LS AE I LRIiEKAVQEBRTQNQVFKTEliAREKDSSLVQLS ESKREfi 

TELP^SALRVLQlOSKEQLQEEXQELLiEYMRKIJ^^ 

EDATTE»EEAAVGLSCPAAIjTDSEDE5PEDM1UjHPMAFVSVETO 

ASTiTJiGLE 


7090 


33 


1775 


SVCWEDRYLKARMEESPLSRAFSRGGVNFIjNVARTYIPNTKVEC 
T4V*rT.PPf5TMP€;a«?nwif3 1 FKVKAACVRDYHTFVWSSVPESi 11X5 
SP IHTSVQFQAS YIiPKPGAQLYQFRYVNRQGQVCGQSPPFQFRE 
PRPMDELVTLEEADGGSDI LLVVPKATVLQNQL.DESQQERNDLM 
QLKI^LF^K^ELRSRVQBLERAIAIAR^ 

HGE ITEERD II^RQOGDHVARI LELEDDI QT I SEKVLTKEVBLD 

RLRiyiVKALTPJSQEKLIjGQIiKEVOJu^KEQSEA 

NIJDIjKEAKSWQEEQSAOAQRIiKDKVAQMKDTI^^ 

t trt?r»i or-Tirv'PT.TiTiQQOOVaTT.T/^'RRT JV5^AAAART>RTIAEL»HRSR 

LBVAEVNGKLAELGIrflLKEEKCQWSKERAG 

LSAEILRI^KAVQEERTQNQVFKTELAREKDSSLVQI*SESKRE1» 
tpt.ocrt TJVT.nifPirEOl^EEKQEIiIiEYMRKLEARLEKVADEKWN 
EDATTEDEEAAVGLS CPAALTDSEDES PEDMRLHPMAFVS V fcTO 

HOT T TiTT.P 


7091 


186 


1076 


EGMIiTREHRCGRS EEQELEPWP S P KKARSGRWLRNGFKRKMEE P 
EEPADSGQSLVPVYIYSPEYVSMCDSLAKIPKRASMVHSIilEAY 
AI^QMRrvnCPICVASMEEMATFHTDAYLQHiQKVSQEGDDDHPD 
q tit vrt rrynCPATEG I FD YAAAIGGAT I TAAQCLI DGMCKVAIN 
WSGG WHHAKKDEASG FCYLNDAVLG ILRLRRJCFEStlL YVDLDLH 
HGDGVEDAFSFTSKVmVSLRlCFSPGFFPGTGDVSDVGLGKGRY 

YS VNVP I QDG 1 QDEKYYQ ICER YEPPAPNPGL 


7092 


522 


□ no 


KOGINEDOEESOKPRIiGEGCEPIS KRQMKKL I KQKQWEEQRELR 
KQKRKEKRKRKKLEROCQMEPNS DGHDRXRVRRDVVHSTLRL I 1 
DCSFDXLM 


7093 


454


655 


NFGVSGVELAQQASMVRMS FVIAACQLVLGLLMTS LTES S I QNS 
ECPQLCVCEIRPWFTPQSTYREA 


7094 


2 


508 


FVRSMHWGVGFAS S RPCWDLS WNQS I S FFGWWAGSEEP FS FYG 
D I IAF PLQDYGG IMAG LGS DP WWKXTL YLTGGAIJ^AAAYLLiHE 
LLVTRKQQEIDS KDAI I IiHQFARPNNGVPSLS P FCLKMETYLRM 
ADLP YQNYFGGKLSAQGKMPW I EYNHEKVSGTE F 1 1 


7095 


1 


411 


IASSI»PKMASLI^SDRVIjYI*VQGEKKVRAPI^QI*YFCRYCSELR 
SI^CVSHEVDSHYCPSCLElWPSAEAKLKKimCANCFDCPGCMH 
TLSTRATS I STQLPDDPAKTTMKKAYYLACG FCRWTSRDVGMAD 
KSVGE 


7096 


224


2067 



ETRSIAVQEKPSQAGRFJISSRISFAGALFLTRFI^QEIjIiLNNFC 
SAMSPAPDAAPAPAS IS LFD1»SADAPVFQGLSLVS HAPGE ALAR 
APRTSCSGSGERESPERXLIX^PT^ISEKLFCSTCBQTFQNHQE 
QREHYKIjDWHRFNLKQRLKDKPIJ^SAIiDFEKQSSTGDLSS I SGS 
EDSDSASEEDI^TIJ^RERATFEKLSRPPGFYPHRVLFQNAQGQF 
LYAYROTLGPHQDPPEEAELLIiQNLQSKGPRDCNnn^f^^ 
GAI FQGREVVTHKTFHRYTVRAKRGTAC^LRDARGGPSHSAGAN 
LRRYNEATLYKDVRDLIjAG P S WAKALE EAGT I LliRAPRSGRS L P 
FGGKGAPIiQRGDPRLWDI PLATRRPTFQEMRVIiHKLTTLHVYE 
EDPREAVRLHS PQTHWKTVREERKKPTEEE I RKI CRDEKEALGQ 
NEES PKG^SGSEGEIX3F<)VELELVELTVGT1,DLCESEVIJPKRRR 
RKRNKKEKSRDQEAGAHRTT J ^QTQEEEPSTQSSQAVAAPLGPL 
I^EAKAPGCPELWNALLAACRAGDVG^/lJCI^l^ PADPRVLSL 
I^API^SGGFTLLHAAAAAGRGSVVRLLLEAGADPTVQCQDH 


7097 


256


1228 


IRTKSAATWEAWPQCGREGSRI ITEPCEANAGSRQELQTBR I SS 
FLAAQGDQAFHS GLETNNS NS ELPLRVGLKVAQG S PLMGGQVSA 








SNS F^RIiHCRIIANEDWMSALCPRLVfDVPLHHI^S I PGSHDTMTYC 
IiNKKSPISHEESIUiLQLLNKALPCITRPVVLKWS 
LDAGVRYLDLR LAHMLEGSEKNLH FVH>3VYTTAI>VEDTLTE I SB 
WLERHPREWI LACRNFEGLSEDLHEYLVACI KNI FGDMLCPRG 
EVPTLRQ LW SRGQQV IVS YEDE S S LRRHHELW PGVPYWWGNR VK 
TEAI»IRYlxETMKSCGR 1 


7098 


82 


956 


SS FLKRCRKVI^C^GIPSEQSLFSTLEEPRDKEIDNYCVMRLQT 
EARSGFWAPNRFP VN I CRMTAVIX3DRGGSSRBTCRCHFH PSLEA 
LVI^LQDWQPGGVGI CTS FLGISWALLDYHRALRTCLPS KPLXiG 
LGSSVIYFLWNI*LI^WPRVLAVALFSALFPSYVAI^FTX3LWLVL 
LLWVWLQGTD FM P D P S S E WL YRVTVAT I L YF S WFNVAEG RTRGR 
Al IHFAFLLSDS ILLVATWVTHSSWLPSGI PLQLWLP VG CGCFF 
LGLALRLVYYHWIOTSCCWKPDPDQVD 


7099 


992 


210 



LFRLAPGFLRS LARQG YHQ I WAFP FLPSGAT ATW P AASRS RSLA. 
ARSLPRSPARPGPNDAIiIKSEHDFRGQGVRAQRFRFSEEPGPGAD 
GAVLE VHVPQ I G AGVSLPG I LAAKCGAEVI LSDSSELPHCLEVC 
R0SCQMNTUjPH1iQVVGI<TWGHISWDI>I«ALPP0DI ILASDVFFEP 
EDFED I LAT I YFLMHKNP KVQLWS T YQVRS ADWSLEALLYKWDM 
KCVHI PLES FDADKEDIAESTLPGRHTVEMLVI s fakds l» 


7100 


205 


671 


ANGG FWEAAPGS EVSI»PLWVPTASHS KTTAIX3 IGSAPPPHLSVL 
FLFSFPPQIXSDPLEAFPVFKKYDRNGIiNVSIECKRVSGIjEPATV 
DWAFDLTKTNMQTMYEO^E WGWKDREKREEWTDDRAWY1* I AW EN 
SS VP VAFSHFR FDVERGDEVLYW 


7101 


2 


503 


WRGGPRRAIOUAGGAVGWVIiVRGVHSVRAGGGRPPRAADMKKD 
VRI LLVGEPRVGKTSIi I MS LVSEEFP EEVPPRABE ITIPADVTP 
ERVPTHIVD YSEAEQSDEQLHQE I SQANV I CI VYAVNNKHS I DK 
VTSR WI PI*INERTDKDSRl»PI*ILGGNKSDL.VBYSR 


7102 


2 


503 


WRGG P RRAKRLAGG AVG WVLL VRGVHS VRAGGGR P P RAADM KKD 
VRILLVGEPRVGKTSI/TMSLVSEBFPEEVPPRAEEITIPADVTP 
ERVPTHIVDYSEAEQSDEQLHQEISQANVICI VYAVNNKHS IDK 
VTSRWI PLINERTDKDSRLPLILGGNKSDLVEYSR 


7103 


119 


438 


GSQS S VAVN IRSGTDEESMDLMNGQAS S VN IAATAS EKS SS S ES 
LSD KG S ELKKS FT)AVVFDVLKVTPEE YAGQ ITLMD VP VFKAI QP 
DELS SCGWNKKEKYS SAP 


7104 


1670 


795 


RLWE^SVSAGASGWGIiSSPGOiLLHPSLPEEERVDIIilNNAGV 
MRCPHWTTEDGFEMQFGVNHLGEAWAGAAP WVQAI LPRRPPKVL 
GF*V* VKSDLFI I LNPGHFLLTNLLLDKLKASAPSRI INLSSLA 
HVAGH IDFDDLNWQTRKYNTKAAYCQS \KLAIVLFTKEI>SRRLQ 
GSGVTVNALHPG VARTET»GRHTG IHGS TFLQHHN \ WAHLLAAWS 
KS PRS WPAPAQHNTLAVAEELA\ VI SG KYFDGLKQKAPAPEAED 
EEVARRiWAESARIiVGLEAPSVREQPIjPR 


7105 


765 


143 


GQMCRR PS PKS TS CLiSMTCDLP / RGLQD PQ CXiAIjFRVAVDKHQA 
LliKAAMSGQGVDRHLFALYIVSRFLHLQSPFLTQVHSEQWQLST 
SQI PVQQMHLFDVHNYPDYVSSGGGFGPADDHGYGVS YI FMGDG 
MITFHI SSKKSSTKTDSHRIX3QHIEDAI*LDVASLFQAGQHFKRR 
FRGSGKENSRHRCGFLSRQTGASKASMTSTDF 


7106 


14 


1064 


GLQAGH PHPRSASRI PEADTH\ YSKLQRAFDS IVNKDHKRM FGT 
YFRVGFFGSKFGDLDEQEFVYKEPAITKLPEISHRIiEAFYGQCF 
GAEFVEVIKDSTPVDKTKLDPNKAYIQITFVEPYFDEYEMKDRV 
TYFEKNFNLRRFMYTTPFTLEGRPRGELHEQYRRNTVLTTMHAF 
PYIKTRI S VIQKEEFVLTP IEVAIEDMKKKTLQLAVAINQEPPD 
AKMLQMVLOGSVGATVNQGPLEVAQVFIAE I PAD P KL YRHHNKL 
RLCFKEFIMRCGEAVEKNKRI/I TADQREYQQELKXNYNKLKENL 
RPMIERKIPELYKPIFRVESQKRDSFHRSSFRKCETQLSQGS 


7107 


1145 


591 


* I * WXiQTGKKK 


7108 


1 


942 


VKVAT.T J ,TNI*EQPRTESEWENSFTI,KMFIjFQFVNIxNSSTFy IAF 
FbGRFTGHPGAYl»RLINRWRI£ECH PSGCLXDIt CMQMG 1 1 MVL.K 
rvTWNWTTMPT^YPT .TDNWWFRRKVROEHGPERKT SFPOWEKDYNTj 
QPMNAYGX.FTJF^EMILQFGFTT1FVAAJ^ 

RIJDAYKFVTQWimP3J\SRAKDIGIWYGILEGIGII*SVITNAFVI 
AITSDFI PRLVYAYKYGPCAGQGEAGQKCMVGYVNASLSVFRI S 
DFFJ^RSEPESIX3SEFSGTPIjKYCRYRPYRDPPHSIjVPYGYTIjQF 


7109 


964 


102 


wdqrkrnslvpgpahgpaqeepwekkeslgaaqealsiq1x)pke 
tqpppkseqvylhfi^vvtedgpepkdkgslpqppitevesqvp 
s e klatdts tfeats egtl elqqrn p kaerlrw s p aqee s f rqm 

V\7T77VT?T D^rrKTmT^CLQECGKTFTYNSHLVVHORVHSGEICPYKC 
SDCGKTFKQSSNLGQHQR I HTGEKP FECNECG KAFRWGAHLVQH 
QRIHSGEKPYECNECGKAFSQSSYLSQHRRIHSGEKPFICKECG 

ISA I v» W >Jo cli JL KilrCK V tl/ul I\J& f Oil 


7110 


96 


697 


RLDNFSGFIjVEVTKEERHIVKPIiYDRYRIjVKQMLTRASITPVLG 
SPSTKRRGGMI^PIIEGETAHFFEEIKEEEEDGVNLSSELGDML 
KTAVQV^SSLKNSESDVEENQEIG^AIJDLR^ 
LWKARAEKKKI>RKTr^FEEAFYQQNGRKAQKEDRVrVLEEYRE 
YKKI KAKLRLLEVLISKQDSSKSI 


7111 


2 


414 


GSGLYRGPTPGGQCIWKPNSMPPDHERNFGFTQFALEIjNEI.TAE 
LiKRS I* PSTDTRIiRFDQR. * JjoKoN IQivUS/^JS-KK. J. c UiXJKiJivrus. v 
MEET^IVHGARFFRRQTTSSGKEWWVTNN^ 

GAVLW 


7112 


103


495 


PRCFPVADRGRLIGGLPDWTIMEGKTLNLiTCTVFGNPDPEVIW 

fkndqdiqlsehfsvkveqakyvsmtikgvtsedsgkys INI kn 

KYGGEKIDVTVSVYKHGEKlPDMAPPQQAKPKIilPASASAAGQ 


7113 


1 


824 


KCLRQAWHEAPSSIiAFTRWCSREERAEGGGNIjHRS xtrdpkppg 

LRPSQRPMDDKKKKRSPKPCIJVQPAGJ^PGTLRRVPVPTSHSGSL 

AIXSLPHLPSPKQRAKFKRVGKEKGRPVliAGGGSGSAGTPLQHSF 

LTEVTDVYEl^GGLUJIjI^NDFHSGItt^QAFGKECSFEQI^HTOE^ 

QEKIJ\RI*HFSIJ3VCGEEEDDEEEFJXrVTEGI^ 

DQLLSNI/SSCLGALVPGG^GGEGTYSQSHSWALGEKVGVHGSK 

O Cf2T>T.WT .DOT? 


7114 


3 


1492


vwevdeqidhykesqdkfi*wciyvpigke^i»kdesgqecki crki 
iylntdfvsvkqrlpkyyswercskhhlnft^qnrsyvrkkddg 
ckaywkvclhynlhkaqpaerf fdpnqrgkalhqkqalrks qrs 
qtgeklyxcttecgkvfiqkanlvvhqrthtgek^yeccecaicaf 
sqkstl iahqrthtgekpyecseogktfiqkstl i khqrthtge 

KPFVCDKCPKAPKSSYHIjIRHEKTHIRQAFYKGIKCTTSSLIYQ 
RIHTSEKPQCSBHGKASDEKPSPTKHWRTHTKENIYECSKCGKS 
FRGKSHL»S VHOR I HTGEKP YECSI CGKTFSGKSKLS VHHRTHTG 
EKPYECRRCGKAFGEKSTIjIVHQRMHTGEKPYKCNECGKAFSEK 
SPLIKHQRIHTGERPYECTDaaCAFSRKSTIilKHQRIH'i'UKKPY 
KCSECGKAFSVKSTLIVHHRTHTGEKPYECRDCGKAFSGKSTLI 

KHQRSHTGDKNL 


7115 


1 


947 


NAAHG YNWGLWCMY I IPPQDWLDRGDESAP IRTPAMIGCS FWD 
REYFGDIGLLDPGMEVYGGENVKLGMRVWQCGGSMEVLPCSRVA 
HIERTRKPYNNDIDYYAKRIlAIiRAAE\mMDDFlCSHVYMAWNIPM 
SNPGVDFGDVS ERLALRQRIiKCRS FICWYLFJIVYPEMRVYNNTIjT 
YGEVRNS KASAYCLDQGAEDGDRAI IiYPCHGMSSQIjVRYSADGL 
LQLGPLGS TAFLPDSKCLVDIXJTGRMPTLKKCEDVARPTQRLWD 
FTQSGPIVSRATGRCLEVEMSKDANFGIJILVVQRCSGQKWMIRN 
WIKHARH 


7116 


866 


95 


RVRMRRNAEVIEEKIiSMKSWAKFRPGEPWKGYPNIDPETDPYVT 
PGSVZKNliS INTVRF^HIiRDRNSGSSSSLNTTIjPSTSAWSS ir 






WO 01/53312 



PCT/US00/34263











ASNYI^>I>SSTAQSTSARNSDSKL1WSPGSVTNTSLAKEIjWKVP 
IiPPKNITAPSRPPPGLTGQXPPLSTWDNSPIiRIGGGWGNSDARY 
TPGSSWGESSSGRITNWLVLKI^TPQIIG 
FHLNIjPHGKALVRYSSKEEVVKAOKSLHI SDLFLLTL 


7117


695 


1261 


LIiISTPGGCHPPPSSIEFTYTGAWGKALPAPHI4PCAPGAI,PQGA 
FVSQAARAI PLLQPSQAAQAEGLSQPARACGADCSIjPWPIiRNWG 
SPILRLPGGLRTPTNDRKTRTRSAMACWARAQWmT^PLKX^TO 
GKVCI>RHPR PTG VllGGPGAAGRO/SGMGTRRRGTFTSGARDPGGIj 
RVKHRCQPTGHTiP 


7118


49 


1863 


PHCEPNPGAGAMVIiLHVLFEHAVGYAIJjALKBVSEIS llqpqve 

esvlnlgkfhs ivrlvafcpfassqvalenanavsegwhedlr 
t.tj.kthlpskkkkvij/3vgdpkigaaiqeblgyncgt«gviaei 

LRGVRLHFHNLVKGTjTDI^ACKAQIXSI^ 

nmi iqs islldqldkdintfsmrvrewygyhppelvki indnat 

YQRLAQFIGNRREIiNBDKIjEKIjEELTJ'IDGA 

MDISAIDLINIESFSSRWSIiSEYRQSLHTYLRSKMSQVAPSnS 

ALIGEAVGARLIAHAGSLTNLAKYPASTVQI1X5AEKALFRALKT 

RGNTPKYGLI FHSTFIGRAAAKNKGRISRYIiANKCSIASRIDCF 

SEVPTSVFGEKIjREQVEERLSFYETGE I PRKNLDVMKEAMVQAE 

EAAAEITRKLEKQEKKRIiKKEKKRLAALALASSEMSSSTPEECE 

EMSEKPKKKKKQKPQEVPQENGMEDPSISFSKPKKKKSFSKEEL 

MSSDLEEXAGSTSIPKRKKSTPKEETVNDPKEAGHRSGSKKKRK 

FSKEEPVSSGPEEAAGKSSSKKKKKFHKASQED 


7119 


49 


1863 


PHCE PNPG^GAMVLljHVLFEHAVGYAliAIiKEVEE I SLL PQVE 
KSVLNIjGKFHS IVRIjVAFCPFASSOVALENANAVSEGVVHEDLiR 

L1^ETHLPSKKKK\7I*UJVGDPKIGAAXQEELGY1JCOT 
LRGVRLHFHNLVKGLTD1*SACKAQLGLGHSYSRAKVKFNVNRVD 

NMI IQSISLLDQI»DKDINTFSMRVREWYGYHPPELVKI INDNAT 
YCRLAQFIGNRRELNEDKLEKLEELTMDGAKAKAILDASRSSMG 
MDI SAIDLINIESFSSRWSLSEYRQSLHTYLRSKMSQVAPS LS 
AL I G EA VG ARL I AHAG S L TNIiAKY PAS TVQ I LG AE KAL» FRAL KT 
RGNTPKYGLIFHSTFIGRAAAKNKGRISRYLANKCSIASRIDCF 
SEVPTSVFGEKLREQVEERLSFYETGE I PRKNLDVMKEAMVQAE 
EAAAEITRKI^KQEKKRLKKEKKRIAAIJUjASSENSSSTPEECE 
EMSEKPKKKKKQKPQEVPQENGMEDPSISFSKPKKKKSFSKEEL 
MSSDLEETAGSTS IPKRKKSTPKEETVNDPEEAGHRSGSKKKRK 
FSKBEPVSSGPEEAAGKSSSKKKKKFHKASQED 


7120 


1991 


64 


QJ^TRRCLiRGDKVTNAMQDFLiVTNLE PRF I E PQTANIjSVVFKDS 
NSTTPL I FVLS PGTDPAADL YKFAEENKFSKKLSAIS LGQGQGP 
RAEAMMRSSIERGKWVFFQNCHUAPSWMPALERLIEHINPDKVH 

RDFRLWLTSLPSNKFPVS ILQNGSKMTIEPPRGVRANLLKS YSS 
I/SEDFIJJSCTKVMEFKSLLLSLCLFHGNALERRKFGPLGFNI PY 

EFTDGDLRICI S QLKMFIiDE YDD I P YKVLKYTAGE INYGGRVTD 
DWDRRCIMNILEDFYNPDVLSPEHSYSASGIYHQIPPTYDLHGY 
LSYIKSLPIiNDMPEI FGLHDNANITFAQNETFALLGTI IQI*QPK 
SSSAGS QGREEIVEDVTQNI LIjKVPEP INLQWVMAKYPVLYEES 
MNTVI,VQEVIRYNRIjI^VITQTIiQDLLKAIjKGLVVMSSQI*ELMA 
AS L YNNTVPELWSAKAYP SLKPLS SWVMDLLQRUD FLQAW 1 QDG 
I PAVFW I SGFFF PQAFLTGT LQNFARKFVI S IDT IS FDFKVMFE 
APSELTQRPQVGCYIHGIJPLEGARWDPEAFQLAESQPKEIiYTEM 
AV 1 WLIiPTPNRKAQDQDFYLCP I YKTLTRAGTLSTTGHSTNYVT 
AVE I PTHQPQRH W I KRGVAL I CALDY 


7121 


2 


546 


" RPLR PW VLS LGS t^VGLMTYGRRQ FQS1>DTTMRRL I P P FREASAK 
LTTXVDADAEAFTAYLEAMRLPKNTPEE KDRRTAALQEGLRRAV 
SVPLTIJ^ETVASLWPALQEIiARCGNIACRSDLQVAAKALEMGVF 
GAYFNVLINLRDITDEAFKI)QIHK^VSSLUJEAKTQAAI>V1^CL 
ETRQE 


7122 


2 


546


RPI^PWVLSl^SMVGI^TYGRRQFQSLDTTMRRLIPPFREASAK 

U X X Xj V JJHU>\£>Hf X/\XXj£*MfiXVur' X\Xi X rCEiM/KK lltnUUWwKKHV 

SVPLTLAETVASxjWPALQELARCXSNI^ 

GAYFN\TLINLRDI TDEAFKDQ IHHRVS SLLQEAKTQAALVLDCL 


7123 


1 


1092 



KPAVPEARSAGTSEAGRSGAEEVSCGSVSGIXSAAMRIjTPRAxjCS 
AAQAAWRENFPLCGRDYARWPPGHMAKGLKKMQSSLKLVDCI IE 
VHDARI PLSGRNPLFQETLGLKPHI^\^KMDLADLTEOQKIMQ 
HLEGEGLKNVI FTNCVKDENVKQ 1 1 PMVTELIGRSHRYHRKENL 
EYCIMVIGVPNVGKSSLINSLRRQKXRKGKATRVGGEPGITRAV 
MSKI QVSERPLMFLLDTPGVLAPR I ES VETCLKLAIXGTVIxDHL 
VGBETMADYxjLYTIjNKHQRFGYVQHYGIXSSACDNVERVLKSVAV 
KXiG KTQKVKvLTGTGlWNV I QPITreAAARDFI^TFREGLLGSVM 
LDLDVlxRGHPRV 


7124 


2 


382 


LPLTLLIJ^APFAHLIxLPPGHDQSPCWHPGPALS PGTLGPLSWAM 
ANSGLQLLGYFLALGGWVGI IASTALPQWKQSS YAGDAS IQLRS 
KVFVLESEWGGDSLGLPRDCGWSCLLHSAVRSEKGFWS 


7125 



166 


1127 


NCXS E KRNYS F S MQKGKG RTS RI RRRKLCGSSE S RGVNE SHKSE 
PIELRJCWLKARKF^DSNIiAPACFPGTGRGIjMSQTSLQEGQMI I s 
LPESCLLTXPJJWIRSYIjGAYITKWKPPPSPLLALCTFLVSEKH 
AGHRSLLEA\ YLEI LPKAYTCPVCLEPEWNLLPKSLKAKAEEQ 
RAHVQEFFASSRDPFSSLQPI^AEAVDSireYSALLWAWCTVOT 
RAVYL\SPGSGNAFU3SRTPVQI^APYIJDLLNHS PHVQVKAAFNE 
ETHSYEIRTTSRWRKHEEVFICYGPHDNQRLiFLEYGFVSVHNPH 
ACVYVSRGWNQLCS 


7126 


1 


733 


CRDMAAFIVPSPARRCSQKGSI^HLPTQPWIiWAAMSPRGQERGT 
S HS QARE P QR PGR WLLGS Iaj S S PLf A \LiG Q A 05 T AS R KJ<(iCM V U K W V 
QVATGRRAVQ V P KGALGLALGETS PGASRGMSGGAGGCWAIjGWA 
PSPVLPSWLLEGPPPWLS 1 1 SDSGTQRPS PRRCPARPSPWGPQC 
WRGGRIASAEASST* TPGSGSRARSGRRS PGSRRRSASAPS PTP 
PTDACA* SCVARPAGSRSSRPAAA 


7127 


1311 


277 


GLPAMCST*KAGYYEETEGDCIPKDR* IEKRPFKEI *RRIPRI F 
AKQKQI * S * NSQKI GASE IDRGRKEADCS DAPAAAR IGAVSVFR 
RSTQEAR VS PRSNAKSANIjRAVRAD* W EH F VLLFHTPEQ FLAEC 
ICRST**K*WHQLC*PLSSL*TGI.KRKXI^*VI J FRI*WLKDCDV 
* FCQKI FATNFCNWQNLIQ* EE * KPVE YSVEN* HIMNLLLPM+ L 
CQSSLRDQT IVTWRM* RNYSMFRINM I SSL* DGS IHI PLKLHFY 
PALIFTLTVPINSCCQRPLPLFAHQSIKTliASSGSPMLACLRFT, 
LVKKRAFIHTPRS PGCSV* CKHVLVKDN KNNCVG S EV 


7128 


2 


5228 


GRVDLWTILLGRSALRELSQ I EAELNKHWRRIxLEGI*S YYKPPS P 
SSAEKVKANKDVASPLKSl^LRISKFLGI^ 
D YRGTRD S VKTVLQDERQSQ AL I LKI AD YYYEE RTC I LRCVLHL 
LTYFODERHPYRVEYADCVDRIiEKELVS KYROOFEELYKTEAPT 
WETHGWLMTERQVSRWFVQCLREQSMLLEI I FLYYAYFEMAPSD 
IxLVLTKMFKEQGFGSRQTNRHLVDETMDP FVDRIGYFSALILVE 
GMDIESLKKCALDDRRELHQFAQDGLI CQDMDCLMLTFGD I PHH 
APVI^WALUOTl^PEETSSVVRKIGGTAIQx^ 
SLASGG^TOCTTSTACMCVYGLLS FVLTS LELHTLGNQQDI IDTA 
CE VLADPSLPELFV7GTEPTSGLG I ILDSVCGMFPHLLS PLLQLL 
RAL VSGKSTAKKVYSFLDKMS FYNEL YKHKPHDVTSHEDGTLWR 
RQTPKLL YPLGGQTNLRI PQGTVGQVMLDDRAYLVRWE YS YSSW 
TLFTCEIEMLLHWSTADVIQHCQRVKPI IDLVHKVISTDLSIA 
DCLXiPITSRI iT^LIXJRLTTVISPPVDVIASCVNCLTVLAARNPA 
KVWTDxxRHTOFX.PFVAHPVSSLSQMISAEGMNAGGYGi^ 
QPCX5EYGVT1AFLRLITTLVKGQ1U3STQSQGLVPC^1FVIJ^^ 
PSYHKWRYNSHGTOEQIGCLILELIHAILNLCHETDLHSSHTPS 
IiQ FLC I CSLAYTEAGQTVIN I MG I GVDTIDMVMAAQ PRSDGAEG 
QGQGQLLIKTVKLAFSVTNNVIiUiKPPSNWSPLEQALSQHGAH 
GNNL I AVLAXY I YHKHD PALP RX*A1 QLLKRLATVAPMS VYACLG 
NDAAAIRDAFLTRIiQSK\IE\DMRIK\VM1L\EFLTYA\VETQP 
GLIELFI^EVKDG\STCSKEFSU5MW\SCLHAV/VWEIjIDSQQ 
QDRYWCP PLLHRAAI AFLHALWQDRRDS AMLVLRTKPKFWENIjT 
SPLFGTLSPPSETSEPSILBTCALIMKIICLBIYYWKGSIjDQP 

l kdt l kkfs i e krfa yws g yvks lavhvaeteg s s cts lle yqm 
lvs awrmll 1 1 atthad imhltds vvrrql fldvldg tkalllv 
pasvncijo^smkctiallilijrqwkreixssto 
vlqadqqlmektkakvfsaf i tvlqmkemkvs d i pq ysqlvlnv 
cetlqeevi alfdotrhslalgsatedkdsmetddcs rsrhrdq 
rdgvcvijgiihlakelc£rvdedgds wlq vtrrlp i lptl»lttlev 
s i>rmkqni*h fteatlht >t ,t >tiartqqgatavagagi tqsiclpl 
lsvyqi^tngtaqtpsasrksij^apswpgvyrx^smsi^eqllkt 
lr ynflpeald fvgvhqertlqclnavrtvqs lacleeadhtvg 

FlhQl^SNFMl^mFlWPQlMRDlQVmjGYhCQACTSFl^SRKy^ 
QHYTiQNKNGDGLPSAV \ aqrv\qrp PS AASAAPS s s kqpaadte 
AS EQQ ALHTVQ YG LL KI LS KTLAALRHFTPD VCQ I LIiDQ S LDLA 
EYKFLFALS FTTPTFDS EVAPS FGTLIATVNVAIjNMIiGELDKKK 
E PLTQAVGI^TQAEGTRTLKSLIiMFTMENCFYLLISQAMRYLRD 
PAVHPRDKQRMKQELS S EIaSTI*I»S SLSRYFRRGAPS S PATGVLP 
SPQGKSTSLSKASPESQEPLIQLVQAFVRHMQR 


7129 


1 


1054 


FRR FRWRRRLH * AGPAS SAGGS PGEAS GTMS GEliPPNINI KEPR 
WDQSTF I GRANHFFTVTDPRN I I*LTNEQLES ARKI VHD YRQG IV 
P PGLTENELWRAKYI YDSAFHPDTGEKMI LIGRMS AQVPMNMTI 
TGCMMTFYRTTPAVLFWQWINQS FNAWNYTNRSGDAPLTVNEL 
GTAYVSATTGAVATALGLNALTKHVS PLI GRFVP FAAVAAANC I 
N I PLMRQRELKVG I PVTDENGNRLGESANAAKQAI TQVVVSR 1 1* 
MAAPGMAI P P FI MNTI»E KKA FIiKRFP WMS AP I QVGLVGFCLVFA 
TPLCCAIiFPQKSSMSVTSLEAELQAKIQESHPELRRVYFNKGIj 


7130 


2 


780 


HEVPSLQTSDPLPGSVQRCSVWSQPNKENWCQDHLYNSLGRKG 
ISAKSQP YHRSQSS SSVLINKSMDS INYPSDVGKQQLLSLHRSS 
RCES HQDIiPDI ADSHQQGTEKLS DLTIjQDSQKVV\A/NRNIjPIjN 
AQIATQNYFSNFKETDGDEDDYVEIKSEEDESELELSHNRRRKS 
DSKFVDADFSDNVCSGNTLHSLNSPRTPKKPVNSKLGLS PYTjTP 
YNDS DKLND YLWRG PS PNQQNXVQS LREKFQCLSS S S FA 


7131 


805 


573 


AAAEGHIEWKFIj I EACKVNP FAKDRWGNI PJCjDDAVQFNHLEW 
KLDQDYQDSYTLSETQAEAAAEALSKENLESMV 


7132 


1420 


1087 


I DMIjIjIiSGAIiVSGP YTL ITTAVSADLGTHKSLKGNAHALSTVTA 
I IDGTGSVGAAIX5PLIiAGLLSPSGWSNVFYMMFADACALLFLI 
RLIHKELSCPGSATGDQVPFKEQ 


7133 



2 


3648 


QQ I PGIjLPAHGESGDAIjRiCPRJjQKPI TGHLDDLFFTL YPS LEKF 
E E ELLELHVQD H FQEG CG PL>DGGALE I LERRLRVG VHNG LG FVQ 
RPQVVVLVPEMDVALTRSASFSRJCVVSSSKTSSGSQAIiVLRSRI. 
RLPEMVGHPAFAVT FQLE YVFSS PAGVDGNAAS VTSLSNLACMH 
MVRWAVWNPIiLEADSGRVTliPIiQGG I QPNPSHCLVYKVPS AS MS 
SEBVKQVESGTLRFQFSLGSEEHLDAPTEPVSGPKVERRPSRKP 
PTSPSSPPAPVPRVLAAPQNS PVGPGLS I SQLAAS PRSPTQHCL 
ARPTSQLPHGS QAS PAQAQEFPLEAG I SHLEADLSQTSLVLETS 
IAEQLQEliPFTPLHAPlVVGTQTRSSAGQPSRASMVLLQSSGFP 
E I LDAN KQ PAEAVS ATEPVTFNPQKEE SDCLQSNEMVIjQFLAFS 
RVAQDCRGTSWPKTVYFTFQFYRFPPATTPRLQIiVQLDEAGQPS 
SGALTHI LVP VSRDGTFDAGS PGFQLRYMVGPG FLKPGERRC FA 
RYLAVQTLQI DVWDGDS LLL I GS AAVQMKHLLRQGRPAVQASHE 
I^WATEYEQDNMWSGDMLGFGRVKPIGWSVVKGRIJJI.TLAN 
VGHPCEQKWGCSTLPPSRSRVISNDGASRFSGGSLLTTGSSRR 

LERMRS VRMEAGGDIjGRRGTS VIAQQSVRTQHLRDLQV IAAYR 
ERTKAES I AS LI»SI*AI TTEHTLHAT1VGVAEFFEFVLKNPHNT0H 
TVTVEIDNPEI>SVIVDSQEWRDPKGAAGLHTPVEEDMFHIiRGSIi 
APQL YLRP HETAHVP PKPQSFSAGQIiAMVQAS PGLSNEKGMDAV 
SPWKSSAVPTKHAKVIiFRASGGKP I AVLCLTVEIiQPHWDQVFR 
FYHPE1»S FL KKAI RL P P WHT FPGAP VGMIjGED P PVHVRCSDPNV 
ICBTQNVGPGEPRDIFLKVASGPSPEIKDFFVIIYSDRWIATPT 
Q XViQ VXLiH S JjQK VO V S CVAGQIjT. Rl*SLi VI^GTQTVKKVRAl* i. SH 
PQEL KTDPKGVFVLPPRGVQDLHVGVRPLRAG SRFVHLNLVDVD 
CHQLVAS WLVCLCCRQPL I S KAFE I M1AAGEG KGVNKR I TYTNP 
YPSRRTFHLHSDHPELLRFREDSFQVGGGETYTIGLQFAPSQRV 
GEEEII»IY2NDHEBKNEEAFCVKVIYQ 


7134 



2115 


1111 


GGEGFSYPPHVGIjSEX5TPIjDPHYV1»IiEVHYDNPTYEEGLIDNSG 
LRLFYTMDIRXYDAGVI EAGLWVSLFHTI P PGMPEFQSEGHCTL 
ECLEEALEAE K PS G 1 HVFAVLLHAHLAGRG I RLRHFRKG kemkl 
IAYDDDFDFNFQEFQYLKEEQT1LPGDNL,ITECRYNTKDRAEMT 
WGGLSTRSEMCLSYLLYYPRINXiTRCASIPDIMEQIjQFIGVKEI 
YRPVTTWPFI IKSP KQYKN1»S FMDAMNKFKWT KKEGLS FNKIiVlj 
SLPVNVRCSKTDNAEWSIQGMTALPPDIERPYKAEPIjVCGTSSS 
SSLHRDFS IN1.LVCX.LLLS CTLSTKSI* 


7135 


2 


2072 


FVPRVTPRSLSLQGPKGBSVGSITQPLPSSYLIFRAASESDGRC 
WLiDALEbALRCSS LLRLGTCKPGRDGEPGTSPDAS PSSLCGLPA 
SATVHPDQDLFPLNGSSLENDAFSDKSERENPEESDTETQDHSR 
KTESGSDQSETPGAPVRRGTTYVEQVQEEIjGEI^EASQVETVSE 
ENKSLMWTLLKQLRPGMDI^RVVLPTFV1£PRSFI*NKI^DYYYH 
ADLLSRAAVEEDAYSRMKLVLRWYI.SGF YKKPKGI KKPYNPII/3 
ETFRCCWFHPQTDSRTFYIAEQVSHHPPVSAFHVSNRKDGFCIS 

IL YGTMTLEIiGG KVTI BCAKNN FQAQLEFKLKP FFGGSTS INQI 
SGKITSGEEVLASLSGHWDRDVFIKEEGSGSSALFWTPSGEVRR 
QRURQHTVPLEEQTELESERLWQHVTRAI SKGDQHRATQEKFAL 
EEAORORAREROES LiMP WKPOL.FHLDP I TOEWHYR YEDHS PWDP 
LKDIAQFEQlX3ILRTI>QQEAVARQTrFLGSPGPRHERSGPDQRli 
RKASDQPSGHSQATESSGSTPESCPELSDEEQDGDFVPGGESPC 
PRCRKEARRIiQALHEAILSIREAQQELHP^I^A^Il J SSTARAAQA 
PTPGLLQS PRSWFLLCVFLACQLFINHIIiK 


7136 


2 


418 


DFVPSFRRPSGNTSQTVVTLIJ?AATI,EKEVAG 
SQQRKVRQMIEQLQNSKAVI QS KDATI QELKE KXA YIjEAENIjEM 
HDRMEHLIEKQISHGNFSTQARAKTENPGSIRISKPPSPKPMPV 
IRWET 


7137 


2 


466 


WASGMSTVPGGSRHSLGIQVRGGWGVTGGEEESLTVPVADTWQA 
GS FKVATQER1*P<2RAQMRLRRQKKGVV P FLGDFLTELQRXDS AI 
P DDLDGNTNKRS KEVRVLQEMQ LLQ VAAMNYRIiRP LE KFV^ Y FT 
RMEQLSDKESYKLSCQLEPENP 


7138


2 


466 


WASGMSTVPGGSRHSliG I QVRGGWGVTGGEE£SLTVPVADTWQA 
GSFKVATQ ERNPQRAQMRI*RRQKKG\A/P FIiGDFLTELQRLDS AI 
PDDLDGirmKRSKBVRVLQEMQL 
RMEQLSDKESYKLSCQIiEPENP 


7139 


1 


357 


S LRNS ARG LKMAASAARGAAALRRS INQ P VAF VRR I PWTAASSQ 
iKEHFAQFGHVRRCII*PFDKETGFHRGLG WVQFSS EEGI*R1^AI*Q 
QENHI IDGVKVQVHTRRPKLPQTSDDEKKDF 


7140 


1401 


1357 


RASSLCVLKAWGGL I PS S FQQQHTGQ YALEELFDIiKVYBC FCS F 
NMNVS LE KQLRPSQ P W P RGKCRKT PGW EEARP KAQ D LRGDLG KT 



OAGPAEAHTRGPPRLPAATGCPPHLPGLLSGISVDIDPTGLQSQ 
w i f lUjyuFfijMr SED x QKSLIjEQYHLGIJ)QKLRKyVVGEIIjI WNF 1 
ADFMTNQCG f 


7141 


124 


1073 


LDSRSCW1JDMEDLEEDVRFIVDETLDFGGLSPSDSREEEDITVI, j 
VTPEKPLRRGI^HRSDP1^VAPAPQGVRI>SIX5PLSPEKLEEILD 
k>^KIjAAQIjEQCAI*QDRESAGEGLGPRR j 
VRDLLPTVNSLTRSTPS /LKQPDASTPE * * *EGVSQGSPGYIWK 1 
EALQHEEGVTHIiQSVPCIQKPS I FSS \SRSTPPVRGRAGPSGRA I 
AASEETRAAXIiRGAAAKSSCQLPI PSAI PRPASRMPLTSRSVPP [ 

GRGALPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQRIiNIiPVM 
GATRSNLQPP


7142 


658 




IjIF:LMLHMEIjKMI»SSVTIjHIRAFLYWICLKPTSC^ 
KK*SRAVGWWMCRT/ YSSDLQVGVI KPWLLLGSQDAAHDLDT 
L KKNKVTH I I»NVA YG VENAFLS D FTY KS I S I LDL P ETN I L»S YFP 
ECFEFIEEAKRKDGW1»VHCNA j 


7143 


3 


773 


SLEMSSDGEPLkSRMDSEDSISSTIMDVDSTISSGRSTPAMMNGQ 
GSTTS SS KNIAYNCCWDQCQACFNS S PDLADH IRS IHVDGQRGG 
VFVCLWKGCKVYNTPSTSQSWLQRHMLTHSGDKPFKCVVGGCNA 
S F ASQGGIiARHVPTHFSQQNSS KVS SQPKAKEES PS KAGMNKRR 

HS WFHS TVS I LLFFQI KYKTLQKKTIST 1 1 SXS LK I | 


7144 


1


988 


FRVNMQDGGPSPAEHSKAEESAGMEARFLGLPDAAGSSGPTPAR 

RCPAPRPAGVSYVIRDEVEKYNRNGVNAI^LDPAIiNRLFTAGRD 
S 1 I RI WS VNQHKQDP Y IASMEHHTDWVND I VLCCNGKTLI S ASS 

DTTVKVWNAHKGFCMSTLRTHKDYVKAliAYAXDKELVASAGLDR 
QIFljWDVNTIiTAIjTASNNT^rrTSSLSGNKDS IYSIiAMNQLGTI I j 
VSGSTEKVLRVWDPRTCAKLMKLKGHTDNV^ j 

GSSDGTIRLWSLGWRCIATYRVHDEGWALQVNDAFTHVYSGG 1 
RDRKIYCTDLRNPDIRVXiICE
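
The relationship between the nucleotide location columns and the amino acid segment column of the table above can be illustrated with a minimal sketch. It assumes that the two location columns give an inclusive nucleotide range and that a begin value larger than the end value (as in the row for SEQ ID NO: 7116) still denotes a range of the same length; neither assumption is stated in the table itself. The function names codon_count and unexpected_symbols are hypothetical helpers introduced only for this illustration.

```python
# One-letter symbols exactly as listed in the table legend.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def codon_count(begin, end):
    """Whole codons in the inclusive nucleotide range (assumed convention)."""
    span = abs(end - begin) + 1
    return span // 3


def unexpected_symbols(segment):
    """Sorted unique characters that are not legend symbols (whitespace ignored)."""
    return sorted({ch for ch in segment
                   if not ch.isspace() and ch not in LEGEND})


if __name__ == "__main__":
    # Row for SEQ ID NO: 7139 lists locations 1 and 357.
    print(codon_count(1, 357))
    # A segment line containing only legend symbols yields an empty list.
    print(unexpected_symbols("QENHI IDGVKVQVHTRRPKLPQTSDDEKKDF"))
```

Under these assumptions, a range of 1 to 357 spans 119 codons. A check such as unexpected_symbols may also help flag segment lines in which characters outside the legend appear.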



WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from the
group consisting of SEQ ID NO: 1-1786 and 3573-5358, a mature protein coding portion
of SEQ ID NO: 1-1786 and 3573-5358, an active domain of SEQ ID NO: 1-1786 and
3573-5358, and complementary sequences thereof.

2. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent 
hybridization conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide has greater than about 90% sequence identity with the 
polynucleotide of claim 1.

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 
operatively associated with a regulatory sequence that modulates expression of the 
polynucleotide in the host cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group 
consisting of: 
(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent 
conditions with any one of SEQ ID NO: 1-1786 and 3573-5358. 

11. A composition comprising the polypeptide of claim 10 and a carrier. 

12. An antibody directed against the polypeptide of claim 10.

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a 
complex with the polynucleotide of claim 1 for a period sufficient to form the complex; 
and 

b) detecting the complex, so that if a complex is detected, the 
polynucleotide of claim 1 is detected.

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions 
with nucleic acid primers that anneal to the polynucleotide of claim 1 under such 
conditions; 

b) amplifying a product comprising at least a portion of the 
polynucleotide of claim 1; and

c) detecting said product and thereby the polynucleotide of claim 1 in
the sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the 
method further comprises reverse transcribing an annealed RNA molecule into a cDNA 
polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a
complex with the polypeptide under conditions and for a period sufficient to form the 
complex; and 

b) detecting formation of the complex, so that if a complex formation 
is detected, the polypeptide of claim 10 is detected.

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound 
complex is detected, a compound that binds to the polypeptide of claim 10 is identified.

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a 
cell, under conditions sufficient to form a polypeptide/compound complex, wherein the 
complex drives expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence 
expression, so that if the polypeptide/compound complex is detected, a compound that 
binds to the polypeptide of claim 10 is identified.

19. A method of producing the polypeptide of claim 10, comprising:

a) culturing a host cell comprising a polynucleotide sequence selected
from the group consisting of a polynucleotide sequence of SEQ ID NO: 1-1786 and 3573-5358,
a mature protein coding portion of SEQ ID NO: 1-1786 and 3573-5358, an active
domain of SEQ ID NO: 1-1786 and 3573-5358, complementary sequences thereof and a
polynucleotide sequence hybridizing under stringent conditions to SEQ ID NO: 1-1786
and 3573-5358, under conditions sufficient to express the polypeptide in said cell; and

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the
group consisting of any one of the polypeptides SEQ ID NO: 1787-3572 and 5359-7144,
the mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide
array. 

22. A collection of polynucleotides, wherein the collection comprises the sequence
information of at least one of SEQ ID NO: 1-1786 and 3573-5358.

23. The collection of claim 22, wherein the collection is provided on a nucleic acid 
array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of 
the polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of 
the polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a
computer-readable format.

27. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 
and a pharmaceutically acceptable carrier.

28. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising an antibody that specifically 
binds to a polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier. 
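
Two notions used in the claims above, complementary sequences (claims 1 and 5) and percent sequence identity (claim 3), can be illustrated with a minimal sketch. The sketch is not the method by which identity is determined for purposes of the claims. It assumes that the two sequences have already been aligned to equal length, with '-' marking gaps, and that identity is computed over the full alignment length; other conventions exist. The function names reverse_complement and percent_identity are hypothetical and introduced only for this illustration.

```python
# Watson-Crick pairing for DNA; case is preserved.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the complementary strand, read in the conventional 5' to 3' direction."""
    return dna.translate(COMPLEMENT)[::-1]


def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding identical residues.

    Both inputs must be pre-aligned to the same length; '-' marks a gap
    and never counts as a match (assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

In this sketch, two ten-residue sequences that differ at one position score 90.0, which would not satisfy a threshold of greater than 90%.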